Abstract


The  classic  randomized  load  balancing  model  is  the  so-called  supermarket  model,  which  describes  a 
system  in  which  customers  arrive  to  a  service  center  with  n  parallel  servers  according  to  a  Poisson 
process with rate $\lambda n$, where $\lambda < 1$. Upon arrival, each customer samples d queues independently and
uniformly  at  random  before  joining  the  shortest  of  those  sampled.  Customers  are  served  according 
to  a  first-in  first-out  (FIFO)  scheduling  rule,  and  their  service  times  are  assumed  to  be  mutually 
independent and exponentially distributed with unit mean ($\mu = 1$). Any ties that may occur are
broken  randomly.  When  d  =  1,  the  model  reduces  to  a  system  of  n  independent  M/M/1  queues, 
for  which  it  is  a  classical  result  that  the  stationary  queue  length  distribution  at  a  single  queue  is 
geometric with parameter $\lambda$, and thus has an exponential decay rate. When $d \ge 2$, the model is
not  exactly  solvable,  but  asymptotic  results  show  that  as  n,  the  number  of  servers,  goes  to  infinity, 
the  limiting  stationary  distribution  of  a  queue  decays  superexponentially.  Moreover,  the  majority 
of  this  gain  in  performance  is  already  obtained  when  d  =  2.  In  particular,  this  shows  that  with  just 
a  slight  increase  in  sampling  cost,  from  d  =  1  to  d  =  2,  the  performance  is  almost  as  good  as  in  the 
case  when  all  queues  are  sampled  (that  is,  the  Join-the-Shortest-Queue  system  where  d  =  n).  This 
phenomenon  is  referred  to  as  the  “power  of  two  choices,”  and  this  classic  model  is  well  studied. 

With  an  aim  to  further  examine  tradeoffs  between  performance  and  sampling  complexity,  we 
examine  a  variation  of  this  model,  which  we  refer  to  as  the  threshold  supermarket  model.  In  this 
model,  upon  arrival,  each  customer  samples  a  single  queue.  If  this  initial  queue  has  length  greater 
than  or  equal  to  a  certain  threshold  T,  the  customer  samples  one  additional  queue  and  joins  the 
shorter  of  the  two  (with  ties  being  broken  uniformly  at  random).  On  the  other  hand,  if  the  initial 
queue  has  length  strictly  below  this  threshold  T,  the  customer  simply  joins  the  queue.  We  use  a 
combination  of  tools  from  the  theory  of  Markov  jump  processes,  ordinary  differential  equations, 
and  recursive  equations  to  analyze  the  system  with  arbitrary  threshold  and  quantify  the  tradeoffs 
between  the  cost  of  sampling  and  the  performance  of  the  system.  In  particular,  we  identify  the 
socially  optimal  threshold  T*  that  minimizes  a  suitable  cost  function  that  includes  a  marginal  cost 
of sampling $c_s$ for each customer. We also provide simulations that demonstrate the accuracy of
our  model’s  predictions.  We  find  that  when  the  cost  of  sampling  is  greater  than  zero,  implementing 
a  threshold  policy  to  determine  the  number  of  queues  to  sample  upon  arrival  significantly  reduces 
the total cost to the system. In fact, for a given $\lambda$ and marginal cost of sampling, the threshold
supermarket  model  always  outperforms  the  corresponding  supermarket  model  with  zero  threshold. 

Finally, we also consider a game associated with the threshold supermarket model. In this
game, each customer chooses a threshold to minimize her total cost, when the marginal sampling
cost is $c_s'$. For a given “actual” marginal cost of sampling $c_s$, we identify a corresponding “imposed”
marginal cost of sampling $c_s'$ for the game such that the social optimum for $c_s$ coincides with the
Nash equilibrium for $c_s'$. The difference $c_s' - c_s$ can be viewed as a tax (or subsidy) that a system
operator levies on individuals so that when each customer behaves selfishly to minimize her cost, the
system converges to the social optimum corresponding to the marginal sampling cost $c_s$. We discuss
the implications of these results for the design of many load balancing systems.
1 Introduction

A  common  challenge  for  a  variety  of  applications  in  computer  science  and  engineering  is 
to  devise  simple  algorithms  that  efficiently  balance  load  across  multiple  resources  so  as 
to  maximize  performance.  In  the  simplest  setting,  consider  a  single  stream  of  customers 
arriving  to  a  system  of  n  parallel  queues  with  independent  identical  servers.  The  question 
is  how  to  route  customers  to  queues  so  as  to  achieve  a  small  expected  queue  length  in 
steady state. One load balancing algorithm that has been proposed is the so-called
Join-the-Shortest-Queue (JSQ) policy, in which an arriving customer joins the queue with the
fewest customers (with ties broken uniformly at random). When the system is stable
and  the  service  distribution  is  exponential,  this  algorithm  achieves  very  good  performance. 
However,  when  n  is  large,  this  algorithm  can  be  infeasible  or  prohibitively  expensive  to 
implement  as  it  requires  knowledge  of  states  of  n  queues  every  time  a  customer  arrives  to 
the  system. 

An  alternative  that  has  been  proposed  is  the  so-called  supermarket  model.  In  this 
model,  customers  arrive  to  a  service  center  with  n  parallel  servers  according  to  a  Poisson 
process with rate $\lambda n$, where $\lambda < 1$, sample d of the servers independently and uniformly at
random  (with  replacement),  and  wait  an  unknown  amount  of  time  in  the  shortest  of  the  d 
queues  sampled  until  served.  Customers  are  served  according  to  a  first-in  first-out  (FIFO) 
service discipline, and their service times are assumed independent and exponentially
distributed with unit mean ($\mu = 1$). Any ties that may occur are broken randomly. Such systems
are  ubiquitous  and  can  be  found  in  a  variety  of  situations,  including  cloud  computing, 
hashing,  data  centers,  resource  allocation,  and  other  service  specific  systems  such  as  banks 
and  grocery  stores. 
Figure 1: The Supermarket Model with d = 2
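
To make the dynamics of the supermarket model concrete, the following is a minimal simulation sketch in Python. It is not taken from the works discussed in this thesis: the function name `simulate_supermarket`, its parameters, and the uniformization device (every server is given a unit-rate clock, and a tick at an empty queue does nothing) are assumptions made purely for illustration.

```python
import random


def simulate_supermarket(n, d, lam, horizon, seed=0):
    """Simulate n unit-rate FIFO queues with Poisson(lam * n) arrivals.

    Each arrival samples d queues with replacement and joins the shortest.
    Returns the time-averaged number of customers per queue.
    """
    rng = random.Random(seed)
    queues = [0] * n          # current length of each queue
    total = 0                 # customers currently in the system
    t = 0.0
    area = 0.0                # integral of `total` over time
    rate = (lam + 1.0) * n    # uniformized event rate: arrivals + server clocks
    while t < horizon:
        dt = rng.expovariate(rate)
        area += total * dt
        t += dt
        if rng.random() < lam / (lam + 1.0):
            # Arrival: sample d queues with replacement, join the shortest,
            # breaking ties at random among the sampled candidates.
            sampled = [rng.randrange(n) for _ in range(d)]
            shortest = min(queues[i] for i in sampled)
            target = rng.choice([i for i in sampled if queues[i] == shortest])
            queues[target] += 1
            total += 1
        else:
            # Clock tick at a uniformly chosen server; a departure occurs
            # only if that server is busy.
            i = rng.randrange(n)
            if queues[i] > 0:
                queues[i] -= 1
                total -= 1
    return area / (t * n)


if __name__ == "__main__":
    for d in (1, 2):
        print(d, simulate_supermarket(n=100, d=d, lam=0.9, horizon=500.0))
```

By Little's law, dividing the returned value by the arrival rate per server gives an estimate of the mean sojourn time. For d = 1 and long horizons the returned value should be close to the M/M/1 mean $\lambda/(1-\lambda)$; for d = 2 it should be noticeably smaller.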

In order to analyze this model, we represent the system as a jump Markov
process. The jump Markov process corresponding to each n-server system can be shown
to have a unique stationary distribution $\pi^{(n)}$, which represents the equilibrium empirical
queue length distribution. However, unlike in the case of an M/M/1 queue, the distribution
$\pi^{(n)}$ does not admit an explicit analytical expression. Instead, to gain insight into $\pi^{(n)}$ for
large n, we study the weak limit $\pi$ of $\pi^{(n)}$ as n tends to infinity. This limit can be shown
to  exist  and  to  admit  a  characterization  as  the  unique  stable  fixed  point  of  a  countable 
system  of  ordinary  differential  equations  (ODEs).  We  use  Kurtz’s  theorem  to  show  that 
the  latter  system  of  ODEs  describes  the  finite-time  evolution  of  the  limit  of  the  (suitably 
scaled) sequence of jump Markov processes that model the n-server system. Figure 2
illustrates  the  basic  framework  for  our  asymptotic  analysis. 

Figure  2:  Methods  of  Analysis 


1.1  The  Classic  Supermarket  Model 

The original “supermarket model” was completely analyzed by Vvedenskaya, Dobrushin,
and Karpelevich [1] for the case d = 2, and the application of Kurtz’s theorem and the
analysis of the system of ordinary differential equations (ODEs) and their fixed points (see
Figure 2) for arbitrary $d \ge 2$ was carried out by Mitzenmacher [2]. For any integer $d \ge 2$,
the continuous version of the dynamics of the supermarket model can be described by the
following system of ODEs: for all $t \ge 0$,

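\[
\begin{cases}
\dfrac{ds_i}{dt}(t) = \lambda\left(s_{i-1}^d(t) - s_i^d(t)\right) - \left(s_i(t) - s_{i+1}(t)\right) & \text{for } i \ge 1, \\
s_0(t) = 1.
\end{cases}
\tag{1}
\]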

Here, $s_i$ represents the limiting fraction of queues that have at least i customers. The
quantity $s_i$ can decrease only when there is a departure from a queue with exactly i customers.
Since the fraction of queues with exactly i customers is $(s_i - s_{i+1})$ and the service rate is
1, the rate of decrease of $s_i$ is equal to $(s_i - s_{i+1})$. On the other hand, the quantity $s_i$
increases only when there is an arrival into a queue with $i - 1$ customers. Observing that
the normalized arrival rate of customers is $\lambda$ and that $(s_{i-1}^d - s_i^d)$ represents the probability
that the shortest of d randomly sampled queues (with replacement) has exactly $i - 1$ customers, we
then obtain the ODE in equation (1). The boundary condition $s_0(t) = 1$ trivially follows
from the fact that at all times, every queue in the system has 0 or more customers.
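
As an illustration of how this system evolves, the sketch below integrates a truncated version of equation (1) with the forward Euler method, starting from an empty system. The truncation level `max_len`, the step size `dt`, the initial condition, and the function name are assumptions chosen for illustration; they are not part of the analysis in [1] or [2].

```python
def integrate_supermarket_ode(lam, d, t_end=100.0, dt=0.01, max_len=40):
    """Forward-Euler integration of the truncated supermarket ODE system.

    s[i] approximates s_i(t), the fraction of queues with at least i
    customers. s[0] is held at 1, and s[max_len + 1] is held at 0 as a
    truncation of the infinite system.
    """
    s = [1.0] + [0.0] * (max_len + 1)   # empty system: s_i(0) = 0 for i >= 1
    steps = int(t_end / dt)
    for _ in range(steps):
        ds = [0.0] * len(s)
        for i in range(1, max_len + 1):
            arrivals = lam * (s[i - 1] ** d - s[i] ** d)
            departures = s[i] - s[i + 1]
            ds[i] = arrivals - departures
        for i in range(1, max_len + 1):
            s[i] += dt * ds[i]
    return s


if __name__ == "__main__":
    s = integrate_supermarket_ode(lam=0.5, d=2)
    print([round(x, 4) for x in s[:5]])
```

For large `t_end` the entry `s[1]` approaches $\lambda$, reflecting that in equilibrium the fraction of busy servers equals the normalized arrival rate.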

It was also shown in [1] and [2] that the system of ODEs has a unique fixed point
$\{s_i\}_{i=0}^{\infty}$, which is obtained by setting $ds_i/dt = 0$ in (1). Indeed, this fixed point is easily
seen to take the following explicit form:

\[
s_i = \lambda^{\frac{d^i - 1}{d - 1}}, \quad \text{for } i \ge 0.
\]

This expression says that for $d \ge 2$, the tail of the limiting stationary queue length
distribution decreases doubly exponentially, which is an exponential improvement over the d = 1
case, in which the tail decreases geometrically, with $s_i = \lambda^i$.
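
One way to see where the explicit form comes from is the following short derivation. It is a sketch under the added assumption that $s_i \to 0$ as $i \to \infty$, which is not spelled out in the text above: setting the derivative in (1) to zero and summing over $i \ge k$ telescopes both sides.

```latex
\begin{align*}
0 &= \lambda\left(s_{i-1}^d - s_i^d\right) - \left(s_i - s_{i+1}\right),
    && i \ge 1, \\
\sum_{i \ge k} \left(s_i - s_{i+1}\right)
  &= \lambda \sum_{i \ge k} \left(s_{i-1}^d - s_i^d\right)
  \;\Longrightarrow\; s_k = \lambda\, s_{k-1}^d,
    && k \ge 1, \\
s_0 = 1 &\;\Longrightarrow\; s_1 = \lambda,\quad s_2 = \lambda^{1+d},\quad \dots,\quad
  s_i = \lambda^{1 + d + \cdots + d^{i-1}} = \lambda^{\frac{d^i - 1}{d - 1}}.
\end{align*}
```

For d = 1 the exponent $1 + d + \cdots + d^{i-1}$ equals i, which recovers the geometric tail of the M/M/1 queue.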

Mitzenmacher goes on to provide simulations to test the accuracy of the model. For
n = 100, he focuses on the expected time a customer spends in the system by taking the
average of 10 runs, each with 100,000 time steps (and ignoring the first 10,000 time steps),
for d = 2, 3, and 5. The following table shows how these simulation results compare to the
results predicted by the ODE approximation. Note that the predicted approximation of
the expected sojourn time of a customer, namely, the expected time a customer spends in
the system (including in the queue and in service), is $\sum_{i=1}^{\infty} \lambda^{\frac{d^i - d}{d - 1}}$.

Table 1: Average Time in the Supermarket Model

d    $\lambda$    Simulation    Prediction    Rel. Error (%)
2    0.50         1.2673        1.2657         0.1289
     0.70         1.6202        1.6145         0.3571
     0.80         1.9585        1.9475         0.5742
     0.90         2.6454        2.6141         1.1981
     0.95         3.4610        3.3830         2.3028
     0.99         5.9275        5.4320         9.1227
3    0.50         1.1277        1.1252         0.2146
     0.70         1.3634        1.3568         0.4858
     0.80         1.5940        1.5809         0.8314
     0.90         2.0614        2.0279         1.6533
     0.95         2.6137        2.5351         3.1002
     0.99         4.4080        3.8578        14.2607
5    0.50         1.0340        1.0312         0.2637
     0.70         1.1766        1.1681         0.7250
     0.80         1.3419        1.3289         0.9789
     0.90         1.6714        1.6329         2.3564
     0.95         2.0730        1.9888         4.2363
     0.99         3.4728        2.9017        19.6825

This  demonstrates  the  accuracy  of  the  ODE  approximation  described  above  for  arrival 
rates  that  are  not  too  close  to  capacity  (i.e.  up  to  95%  capacity). 
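
The "Prediction" column of Table 1 can be reproduced from the fixed point. The sketch below assumes the prediction is obtained via Little's law as $(1/\lambda)\sum_{i \ge 1} s_i = \sum_{i \ge 1} \lambda^{(d^i - d)/(d - 1)}$; the function name and the truncation tolerance are illustrative choices, not something specified in [2].

```python
def predicted_sojourn_time(lam, d, tol=1e-15):
    """Sum lam ** ((d**i - d) / (d - 1)) over i >= 1 until terms are negligible.

    Requires an integer d >= 2 and 0 < lam < 1. The exponent equals
    d + d**2 + ... + d**(i - 1), so integer division is exact.
    """
    total = 0.0
    i = 1
    while True:
        exponent = (d ** i - d) // (d - 1)
        term = lam ** exponent
        total += term
        if term < tol:
            break
        i += 1
    return total


if __name__ == "__main__":
    for d in (2, 3, 5):
        for lam in (0.50, 0.70, 0.80, 0.90, 0.95, 0.99):
            print(d, lam, round(predicted_sojourn_time(lam, d), 4))
```

Because the exponents grow geometrically in i, only a handful of terms contribute even at $\lambda = 0.99$.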

In  conclusion,  the  papers  [1]  and  [2]  make  the  powerful  observation  that  increasing 
the  number  of  choices  from  d  =  1  to  d  =  2  brings  an  exponential  improvement  in  the 
stationary sojourn time, while increasing the number of choices to $d \ge 3$ adds only a
geometric improvement on top of that. This result has driven countless subsequent studies
aimed  at  optimizing  the  load  balancing  problem.  In  the  next  section,  we  discuss  various 
modifications  of  the  original  supermarket  model  that  have  been  studied  in  the  literature. 

1.2  Outline  of  the  Thesis 

This  thesis  begins  with  a  comprehensive  survey  of  previously  analyzed  variations  of  the 
classic  supermarket  model  in  Section  2.  We  then  introduce  our  threshold  supermarket 
model  in  Section  3.1  and  explain  our  motivation  for  studying  such  a  system.  The  main 
results  of  our  analysis  can  be  found  in  Section  3.2,  followed  by  our  examination  of  an 
application  of  our  model.  Our  analysis  of  this  application,  in  which  we  explore  the  tradeoffs 
between  performance  and  costs  from  sampling,  can  be  found  in  Section  3.3.  We  also 
present  simulations  in  Section  3.4,  which  show  the  accuracy  of  our  proposed  model.  We 
then  provide  detailed  derivations  of  our  model  and  proofs  of  our  main  results,  which  can  be 
found  in  Section  4.  Finally,  we  introduce  and  analyze  the  associated  threshold  supermarket 
game.  The  description  of  the  game,  along  with  our  main  results  and  their  implications,  can 
be  found  in  Section  5.  A  summary  of  our  findings  and  remaining  open  questions  conclude 
this  thesis  and  can  be  found  in  Section  6. 

2  Literature  Review 

2.1  Modifications  of  the  Classic  Supermarket  Model 

2.1.1  Migration  Costs 

The  original  supermarket  model  assumes  that  any  migration  of  jobs  incurs  no  cost  or 
penalties.  In  their  paper  “Load  Balancing  with  Migration  Penalties”  [3],  Farias,  Moallemi, 
and  Prabhakar  introduce  a  new  system  in  which  there  is  a  cost  for  migration.  They  propose 
that  after  a  customer  arrives  at  a  queue,  the  queue  can  either  decide  to  keep  and  process 
the  task  or  push  the  task  to  another  server,  transforming  this  K  =  1  job  into  K  >  1 
independent  jobs.  In  this  way,  the  system  captures  a  more  realistic  system  in  which  the 
act  of  pushing  a  task  has  a  cost  to  the  overall  throughput. 

Because transferring a task to an alternate queue can now become costly, Farias et al.
propose two possible criteria for determining whether or not a task should be pushed. The
first criterion states that a job is transferred only if the sampled queue satisfies an additive
threshold condition (that it has at least K jobs fewer than the original queue). The second
criterion states that the job is transferred only if the sampled queue satisfies a multiplicative
threshold condition. These threshold conditions are defined as follows:

\[
\text{Additive Threshold:} \quad q_i(t-1) > q_{i^*}(t-1) + C,
\]
\[
\text{Multiplicative Threshold:} \quad q_i(t-1) > \max\left(\alpha\, q_{i^*}(t-1),\; q_{i^*}(t-1) + K\right),
\]

where K is the number of jobs sent to another queue in migration, $i^*$ represents the index
of the shortest of the d queues sampled with replacement, $C > K$, and $\alpha > 1$. Their goal is
to identify simple migration policies that can reduce backlogs, while providing the highest
throughput.
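
The two migration criteria can be written as simple predicates. The sketch below is illustrative only: the function and argument names are hypothetical, and the strict inequalities follow the conditions as they are printed above rather than an independent reading of [3].

```python
def additive_rule(q_i, q_star, C):
    """Migrate only if the original queue exceeds the sampled one by more than C.

    q_i: length of the queue that received the job.
    q_star: length of the shortest sampled queue.
    """
    return q_i > q_star + C


def multiplicative_rule(q_i, q_star, alpha, K):
    """Migrate only if q_i exceeds both alpha * q_star and q_star + K."""
    return q_i > max(alpha * q_star, q_star + K)


if __name__ == "__main__":
    # With a long sampled queue the multiplicative term dominates, so the
    # gap needed to trigger a migration grows with the queue lengths.
    print(additive_rule(q_i=24, q_star=20, C=3))                   # True
    print(multiplicative_rule(q_i=24, q_star=20, alpha=1.5, K=2))  # False
```

The usage lines show the qualitative difference: under the additive rule a fixed gap always suffices, whereas the multiplicative rule demands a gap proportional to the load.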

This model could be of relevance in a system where there exist specialists and generalists.
In this scenario, a specialist can accomplish a specific task very quickly. Unfortunately,
if jobs pile up and the queue reaches a certain threshold length, it may be forced to push a
job to a shorter queue (that of a generalist) to reduce backlog. However, because the
generalist lacks the specialist’s specific expertise, the generalist will take longer to process the
task. This scenario illustrates when a cost may appear from migration within a system. In
brief, this variant of the supermarket model captures the notion of a preferred allocation,
which is of relevance for caching in multiprocessor systems wherein each processor has a
preferred cache [4].

After analyzing the associated ordinary differential equations, their fixed points, and
simulations, Farias et al. found that the additive threshold policy is stable when $\lambda$ is
sufficiently small but can be unstable for some larger $\lambda < 1$, thus showing that this policy leads to a loss in throughput.

On  the  other  hand,  they  showed  that  the  multiplicative  threshold  policy  is  always  stable, 
provides  no  loss  of  throughput,  and  provides  improvements  in  average  waiting  time.  This 
implies  that  sharing  can  provide  improvements  over  the  non-sharing  case,  but  we  ask:  is 
there  an  optimal  threshold  to  decide  when  to  share? 

2.1.2  Asymmetric  Models 

Vöcking presents another interesting extension in his paper “How Asymmetry Helps Load
Balancing” [5]. He begins by examining the effects of asymmetry, non-uniform sampling,
and dependent sampling in a related balls and bins model. In the classic balls and bins
model analyzed by Azar, Broder, Karlin, and Upfal [6], d bins are sampled uniformly and
independently at random, and the ball is assigned to the least loaded of the d bins, with
ties broken uniformly at random. Vöcking expands upon this by looking at sampling
non-uniformly and possibly dependently from the bins. To perform the analysis, he defines
three different types of bin selection:

1) The classic model which samples d locations uniformly and
independently at random. However, instead of ties being broken
randomly (as is the case in [6]), any ties that may occur are resolved
by the Always-Go-Left policy, which places the ball in the
leftmost group of those classified as the least-loaded. This asymmetry
is found to be irrelevant in this uniform case.

2) A model in which the d locations are sampled non-uniformly
and independently at random. The n bins are divided into d
groups of equal size and numbered from 1 to d. For each ball, d of
the bins are sampled as follows: The ith location, for $1 \le i \le d$,
is sampled uniformly and independently at random from the ith
group. The ball is then placed in the least loaded of the d bins,
where any ties are broken according to the Always-Go-Left policy.

This asymmetry, unlike in the uniform case, provides a dramatic
improvement in the non-uniform case. It should be noted however
that  this  assignment  requires  the  state  of  f, ,  rather  than  just  d, 
queues. 

3) A model where the d bins are sampled non-uniformly and
possibly dependently at random, where any ties are broken according
to the Always-Go-Left policy.
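
The first two selection schemes can be compared empirically with a small balls and bins experiment. The sketch below is an illustration under stated assumptions, not Vöcking's own procedure: the function names are hypothetical, d is assumed to divide n, and the uniform scheme breaks ties at random as in [6].

```python
import random


def max_load_uniform(n, m, d, rng):
    """Place m balls into n bins; each ball samples d bins uniformly with
    replacement and goes to the least loaded, ties broken at random."""
    bins = [0] * n
    for _ in range(m):
        sampled = [rng.randrange(n) for _ in range(d)]
        least = min(bins[i] for i in sampled)
        target = rng.choice([i for i in sampled if bins[i] == least])
        bins[target] += 1
    return max(bins)


def max_load_go_left(n, m, d, rng):
    """Place m balls into n bins split into d equal groups; each ball samples
    one bin per group and goes to the least loaded, ties going to the
    leftmost group (Always-Go-Left). Assumes d divides n."""
    size = n // d
    bins = [0] * n
    for _ in range(m):
        # Group g holds bins g * size, ..., (g + 1) * size - 1.
        sampled = [g * size + rng.randrange(size) for g in range(d)]
        # min() returns the first minimum, i.e. the leftmost least-loaded bin.
        target = min(sampled, key=lambda i: bins[i])
        bins[target] += 1
    return max(bins)


if __name__ == "__main__":
    rng = random.Random(0)
    n = 100_000
    print("d=1 uniform :", max_load_uniform(n, n, 1, rng))
    print("d=2 uniform :", max_load_uniform(n, n, 2, rng))
    print("d=2 go-left :", max_load_go_left(n, n, 2, rng))
```

At practical values of n the difference between the two d = 2 schemes is often at most one ball, since the bounds differ only in the constant multiplying $\ln \ln n$.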

To analyze the sequential allocation scheme with these three different bin selection
methods, Vöcking examined the upper and lower bounds on the maximum load of the
systems. For the case where d = 1, the introduction of sampling from a non-uniform
distribution causes the expected maximum load to worsen. However, for d > 1, the upper
bound on the maximum load in the non-uniform case (Case 2) greatly improves upon the
upper and lower bounds of the uniform case (Case 1).

Uniform and Independent:                    $\ln \ln n / \ln d \pm O(1)$
Non-Uniform and Independent (upper bound):  $\ln \ln n / (d \cdot \ln \phi_d) + O(1)$

where $\phi_d = \lim_{k \to \infty} \sqrt[k]{F_d(k)}$, and $F_d$ is a function defined recursively as
follows: $F_d(k) = 0$ for $k \le 0$, $F_d(1) = 1$, and $F_d(k) = \sum_{i=1}^{d} F_d(k - i)$ for $k \ge 2$. This result also shows
that the effect of d on the maximum load in the non-uniform case is linear (whereas the
effect is only logarithmic in the original uniform case). Additionally, he found that the
non-uniform and dependent scheme (Case 3) leads to an almost matching lower bound,
proving that adding dependence produces no additional significant improvements.
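
The constants $\phi_d$ are easy to evaluate numerically. The sketch below builds the generalized Fibonacci numbers $F_d(k)$ from the recursion above and, as a convenience that is an assumption of this example, estimates $\phi_d$ by the ratio $F_d(k+1)/F_d(k)$, which has the same limit as the k-th root but converges much faster. The function names are hypothetical.

```python
def generalized_fibonacci(d, k_max):
    """Return [F_d(0), ..., F_d(k_max)] with F_d(k) = 0 for k <= 0,
    F_d(1) = 1 and F_d(k) = F_d(k-1) + ... + F_d(k-d) for k >= 2."""
    F = [0, 1]
    for k in range(2, k_max + 1):
        # Terms with non-positive index are zero, so the window is clipped at 0.
        F.append(sum(F[max(0, k - d):k]))
    return F


def phi(d, k=200):
    """Estimate phi_d via the ratio of consecutive generalized Fibonacci numbers."""
    F = generalized_fibonacci(d, k + 1)
    return F[k + 1] / F[k]


if __name__ == "__main__":
    for d in (2, 3, 4, 5):
        print(d, phi(d))
```

For d = 2 this gives the golden ratio, and the values increase toward 2 as d grows, so $d \cdot \ln \phi_d$ grows essentially linearly in d.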

Vöcking then went on to analyze the case where more balls exist than bins and where an
infinite number of ball additions and deletions occur. He came to the following generalized
theorem: “Supposing that at most $h \cdot n$ balls exist at any point in time. Then the
Always-Go-Left algorithm yields maximum load $\ln \ln n / (d \cdot \ln \phi_d) + O(h)$,” where a ball is assumed
to exist at time t if it has entered the system but has not yet been deleted.

This  leads  us  to  ask:  How  does  asymmetry  affect  other  extensions  of  the  load-balancing 
problem?  Does  it  yield  improvements  in  models  with  an  adaptive  d? 

2.1.3  Memory 

Working with Prabhakar and Shah, Mitzenmacher expanded and improved upon his original
model in a paper titled “Load Balancing with Memory” [7], published in 2002. This
memory model has the same set-up as the supermarket model, described in the introduction,
but instead of sampling d new queues with each arrival, only $d - 1$ queues are sampled
and compared with the memory of the shortest queue after placement of the previous
arrival. This can be generalized to storing $1 \le m \le d$ of the least-loaded queues from the
last round of sampling. Such a system is denoted as a (d,m)-memory policy, where d
represents the number of new queues sampled uniformly and independently at random and
m represents the number of queues stored in memory. For simplicity, Mitzenmacher et al.
chose to focus on the case where m = 1. In addition, in order to draw comparisons to the
commonly analyzed case with d = 2 choices, focus remained on a system where each new
arrival is routed to the shorter of one random queue and one queue from memory. This
scheme is labeled the (1,1)-memory policy.
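
A balls and bins version of the (d,1)-memory policy can be sketched in a few lines. This is an illustrative simplification rather than the model of [7]: there are no departures, ties are broken by candidate order instead of at random, the initial remembered bin is arbitrary, and all function names are hypothetical.

```python
import random


def max_load_no_memory(n, m, d, rng):
    """Baseline: each ball goes to the least loaded of d uniformly sampled bins."""
    bins = [0] * n
    for _ in range(m):
        candidates = [rng.randrange(n) for _ in range(d)]
        target = min(candidates, key=lambda i: bins[i])
        bins[target] += 1
    return max(bins)


def max_load_with_memory(n, m, d, rng):
    """(d,1)-memory: each ball compares d fresh samples with one remembered bin."""
    bins = [0] * n
    memory = rng.randrange(n)   # arbitrary initial remembered bin
    for _ in range(m):
        candidates = [rng.randrange(n) for _ in range(d)] + [memory]
        target = min(candidates, key=lambda i: bins[i])
        bins[target] += 1
        # Remember the least-loaded candidate after the ball has been placed.
        memory = min(candidates, key=lambda i: bins[i])
    return max(bins)


if __name__ == "__main__":
    rng = random.Random(0)
    n = 100_000
    print("d=1, no memory :", max_load_no_memory(n, n, 1, rng))
    print("d=2, no memory :", max_load_no_memory(n, n, 2, rng))
    print("(1,1)-memory   :", max_load_with_memory(n, n, 1, rng))
```

The (1,1)-memory run inspects only one fresh bin per ball, the same sampling effort as d = 1 without memory, yet its maximum load is typically comparable to that of two fresh choices.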

In a previous paper, Shah and Prabhakar [8] found that when analyzing the supermarket
model with service rates $\mu_i$, where $\sum_i \mu_i = n$, and with sampling d locations with
replacement (but without memory), the backlog of the system grows without bound. However,
by implementing the (1,1)-memory policy, the queue lengths remain bounded. The
use of memory provides the ability to differentiate slower servers from faster servers and
helps achieve stability. Shah and Prabhakar also utilized large deviations theory to find a
bound for the maximum load in the (d,1)-memory policy with high probability.

The paper [7] analyzes the same load balancing problem with memory (in both the discrete
and continuous cases) but finds exact bounds. This is obtained by applying
an extension of Kurtz’s theorem. The system is modeled as two Markov chains: one for
memory and one for servers. Simulations are also provided, which support the accuracy of
the model’s predictions but also emphasize the importance of the O(1) constants to the
system’s actual behavior. Average time spent in the system is the quantity of interest.

Mitzenmacher et al. found that implementing a (1,1)-memory policy provides a dramatic
improvement over sampling two random queues. In fact, the bound on the maximum
load matches that found for Vöcking’s asymmetric Always-Go-Left scheme, which is as
follows:

\[
\frac{\ln \ln n}{2 \ln \phi} + O(1), \quad \text{where } \phi = (1 + \sqrt{5})/2 \text{ is the golden ratio.}
\]

Additionally,  Mitzenmacher  et  al.  [7]  examined  the  same  system  with  memory  but 
added  the  component  of  truncation  at  a  threshold  level  L.  They  defined  and  analyzed  two 
different  versions  of  this:  1)  SbL  where  any  arrival  at  a  queue  of  length  at  least  L  is  dropped 
from  the  system  and  2)  S'f  where  any  arrival  at  a  queue  of  length  at  least  L  ignores  memory 
and  resorts  back  to  choosing  d  random  queues.  Surprisingly,  both  of  these  schemes  have 
the same limit and lead to identical results as the original (d,1)-memory system.

In  practice,  these  results  provide  a  good  approximation  for  systems  with  large  bursts 
of  tasks  from  a  source,  making  the  assumption  of  some  memory  reasonable.  A  common 
application  is  sticky  routing,  in  which  an  efficient  route,  once  identified,  is  used  repeatedly 
until  congestion  occurs.  Can  combining  asymmetry  and  memory  provide  even  further 
improvements? 
2.1.4 Load Stealing

Thus far, we have only examined models which incorporate work sharing. A related and very
interesting line of work analyzes the effectiveness of load stealing. In such a system, instead of a
queue passing a task along to another when it becomes “overloaded,” as in work sharing,
a queue with spare capacity seeks out a task from overloaded queues. In theory, this
model of load stealing is more efficient, because when all processors are busy, no migration
attempts are made, reducing the system’s work.

Mitzenmacher [9] analyzed several different versions of load stealing models by allowing
the number of processors to approach infinity, developing families of differential equations
to model the systems, and finding the fixed points of the systems. These limiting models
are made possible through Kurtz’s theorem as described above. With each system,
Mitzenmacher also performed simulations to check the accuracy of the models’ predictions. These
predictions become more accurate (and analysis is more interesting) in cases where the
arrival rates are high. Therefore, for many of the systems described below, Mitzenmacher
focused on arrival rates $\lambda = 0.90$ and $\lambda = 0.95$. This is interesting because it is in contrast
to the “Power of Two Choices” [2], in which the predictions become worse as arrival rates
increase.

He begins by analyzing a simple load stealing system, which has all the properties of
the classic supermarket model with d = 1, except that when a queue becomes empty, it
samples one of the other queues independently and uniformly at random and “steals” a job if
the queue sampled contains more than one job. This simple stealing system proves stable,
and the tails decrease geometrically faster than in the system without stealing. He then gradually
adds more components to the model to make it more interesting and more applicable to
real-life settings.

In the next system examined, a steal only occurs if the queue sampled has some minimum
threshold of T tasks (previously T = 2), allowing stealing to be more efficient. Again,
the system proves to be stable with the tails decreasing at a geometrically faster rate.
Building on this threshold stealing, Mitzenmacher allowed for repeated steal attempts.
That is, if the first queue sampled did not contain the minimum threshold of tasks,
another queue would be sampled, and so on. These stealing attempts occur at rate r per
unit of time, with the time between attempts assumed to be exponentially distributed.
The next system modeled pre-emptive stealing, or the act of stealing a job when one's
own queue has at most $B \ge 1$ tasks left. Intuitively, this provides for a more even
queue distribution. It only makes sense to allow a steal if the queue sampled contains
more than B tasks. Therefore, Mitzenmacher chose the threshold $T \ge B + 2$.

He then went on to more complex (and more realistic) extensions of the model. These
extensions increased the state space, making analysis much more difficult. The first
interesting case explored involves constant service times, as opposed to the classical
but unrealistic exponentially distributed service times. Mitzenmacher approximated
constant service by a sequence of c stages. The service time of each stage is
independent and exponentially distributed with mean $1/c$. In this way, the expected
total time in these c stages is 1, and as $c \to \infty$ the total service time
approaches the constant 1. Mitzenmacher chose the case where c = 20 and found that the
constant service times system performs significantly better than that of the
exponential service time. Another interesting, yet complicated, system adds a transfer
delay. Instead of a transfer occurring instantaneously, as assumed before, transfers
occur at rate r; that is, the transfer delay is exponentially distributed with mean
$1/r$. He applies this transfer delay to the system where stealing occurs only when a
queue is empty and the queue sampled has a minimum of T tasks. A queue can only steal
one task at a time. In other words, the queue cannot steal again when a task is already
on the way. Because arrivals may occur at a queue while a steal is being transferred to
it, the threshold T must be chosen carefully to ensure stealing is efficient.
Therefore, to minimize the expected time
for a task, the best choice is T = ^ ^ . Transfer delays, even if they are very short,
are found to have a very large negative impact on performance, which is quite
surprising.
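
To make the stage construction above concrete, the following short calculation is a sketch under the assumption that each of the $c$ stages is an independent exponential random variable $X_k$ with mean $1/c$ (the symbols $X_k$ and $S_c$ are introduced here only for illustration). It shows why the total service time keeps mean 1 while becoming nearly deterministic as $c$ grows.

```latex
\[
S_c = \sum_{k=1}^{c} X_k, \qquad
E[S_c] = c \cdot \frac{1}{c} = 1, \qquad
\operatorname{Var}(S_c) = c \cdot \frac{1}{c^2} = \frac{1}{c}
\xrightarrow[c \to \infty]{} 0 .
\]
% For c = 20 the variance is 0.05, compared with variance 1 for a single
% exponential service time with mean 1.
```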

Mitzenmacher  then  explored  the  possibility  of  choice  in  stealing,  where  d  queues  are 
sampled  before  a  steal  occurs  from  the  most  loaded  queue,  assuming  at  least  one  meets 
the  minimum  threshold  T.  Although  adding  choice  does  improve  the  system,  the  majority 
of  the  gain  from  stealing  comes  from  d  =  1.  Mitzenmacher  concluded  that  in  real  systems, 
adding  choice  to  stealing  is  not  worthwhile.  Lastly,  he  analyzed  the  case  where  multiple 
steals can occur. Intuitively, this can be appropriate and reduce average time spent in
the  system  if  the  threshold  T  is  high.  Mitzenmacher  considers  the  case  where  k  <  ^  tasks 
are  stolen  at  a  time.  This  and  other  similar  variations  of  k  equalize  the  loads  and  do  in 
fact  improve  performance. 

In  conclusion,  Mitzenmacher  found  that  as  predicted,  load  stealing  models  improve 
performance  (compared  to  those  without).  This  is  evidenced  by  the  fact  that  the  expected 
time spent in the system decreases and the tails decay at a geometrically faster rate.
Having $d \ge 2$ choices when stealing also provides improvement; however, the majority
of the gains are already seen when d = 1. Additionally, adding a transfer delay with
rate r greatly hurts the system’s performance, even when delays are small. Mitzenmacher
also proposes other interesting load stealing systems to analyze. The first is allowing
the processors to have heterogeneous service times. He describes a case where fast and
slow processors can be distinguished when making choices. Another interesting system he
describes
includes  differentiating  between  internal  and  external  arrival  rates,  where  internal  arrivals 
refer  to  jobs  “spawned”  by  tasks  already  at  a  processor.  He  also  expresses  his  interest 
in  developing  a  general  framework  that  will  help  with  the  complex  convergence  issues  he 
faced  in  many  of  these  more  complicated  load  stealing  models. 

2.1.5  Pull  vs.  Push 

Minnebo  and  Van  Houdt  recently  published  a  paper  titled  “A  Fair  Comparison  of  Pull 
and  Push  Strategies  in  Large  Distributed  Networks”  [10],  in  which  they  further  explore  the 
advantages and disadvantages of pull (load stealing) and push (load sharing) strategies.
By allowing the number of servers to approach infinity, they compared the average time
spent in the system under pure push strategies, pure pull strategies, and hybrid
strategies. The system analyzed is slightly different from the one examined by
Mitzenmacher. Instead of attempting a push/pull after the completion or arrival of a
task, an idle queue in a pull system makes attempts at some rate r and a queue with jobs
waiting in a push system makes attempts at some rate r. They introduce a new tool of
analysis: normalizing the different strategies by setting the overall probe rate R of
each equal. Here R is defined as the average number of push/pull attempts sent by a
queue per unit of time. Given the maximum overall probe rate R and arrival rate
$\lambda$, Minnebo and Van Houdt found that push strategies actually perform better
than pull strategies for any $R > 0$ whenever $\lambda < \phi - 1$, where
$\phi = \frac{1 + \sqrt{5}}{2}$ is the golden ratio. The push strategy performs better
than the pull strategy if and only if
$2\lambda < \sqrt{(R+1)^2 + 4(R+1)} - (R+1)$. Additionally, hybrid strategies always
perform  worse  than  either  a  pure  pull  or  pure  push  strategy.  This  analysis,  however,  only 
examines  the  case  where  push  attempts  begin  when  there  is  at  least  one  task  in  the  queue, 
and  pull  attempts  begin  only  when  a  queue  becomes  idle.  Additionally,  a  queue  only 
accepts  a  job  if  it  is  currently  idle.  There  are  many  natural  extensions  of  this  still  to  be 
explored,  such  as  attempting  to  push  only  when  some  fixed  minimum  threshold  T  >  1  is 
reached,  or  preemptively  attempting  to  pull,  or  accepting  a  job  at  some  threshold  T  >  0. 
Minnebo  and  Van  Houdt  presented  some  interesting  new  parameters  to  focus  on  when 
comparing  push  and  pull  systems,  but  left  much  room  for  further  research. 
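
The push/pull condition quoted above is easy to evaluate numerically. The sketch below is only an illustration of that inequality; the function names `push_pull_boundary` and `push_beats_pull` are hypothetical and do not come from [10].

```python
import math


def push_pull_boundary(R):
    """Arrival rate below which push beats pull, for overall probe rate R.

    Rearranges 2*lam < sqrt((R+1)^2 + 4(R+1)) - (R+1) into lam < boundary.
    """
    x = R + 1.0
    return (math.sqrt(x * x + 4.0 * x) - x) / 2.0


def push_beats_pull(lam, R):
    """True when the quoted inequality holds for arrival rate lam."""
    return lam < push_pull_boundary(R)


if __name__ == "__main__":
    golden = (1.0 + math.sqrt(5.0)) / 2.0
    # As R -> 0 the boundary tends to phi - 1, the load below which push
    # is better for every R > 0.
    print(push_pull_boundary(0.0), golden - 1.0)
    print(push_beats_pull(0.5, 1.0), push_beats_pull(0.8, 1.0))
```

Since the boundary increases with R, any load below $\phi - 1$ satisfies the inequality for every positive probe rate, which matches the statement in the text.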

2.1.6  Centralization 

Another  interesting  topic  to  consider  is  centralization.  In  a  recent  paper  titled  “On  the 
Power  of  (even  a  little)  Centralization  in  Distributed  Processing”  [11],  Tsitsiklis  and  Xu 
examine  the  effectiveness  of  centralization.  The  focus  remains  on  systems  with  many  local 
stations,  which  can  only  serve  tasks  addressed  specifically  to  them.  They  add  the  idea 
of  sending  a  fraction  p  of  resources  to  serve  in  a  centralized  manner,  such  as  serving  the 
most-loaded  station.  They  describe  two  separate  applications  of  interest. 

The  first  is  a  group  of  local  stations  which  are  connected  to  a  centralized  processor  that 
can  serve  the  station  with  the  largest  queue  whenever  possible.  A  key  question  is:  what  is 
the  optimal  fraction  p  of  resources  to  devote  to  the  central  server?  Because  local  servers 
cannot  transfer  tasks  between  each  other,  it  logically  makes  sense  to  give  more  resources  to 
a  central  server.  Doing  this  would  prevent  one  station  from  being  empty  while  another  has 
a  large  and  growing  queue.  However,  this  would  require  much  more  communication.  Do 
the  performance  improvements  outweigh  the  overhead  costs?  They  also  wonder  if  adding 
a  little  centralization  is  any  different  than  none. 

The  second  application  is  a  group  of  local  stations,  which  are  all  serviced  by  one  central 
server.  Optimally,  the  server  pulls  from  the  station  with  the  greatest  load.  However,  these 
communications  can  be  very  costly  and  in  some  cases,  not  possible.  In  such  scenarios,  a 
task is pulled uniformly at random from the local stations. If the station sampled is empty,
it  is  a  waste  of  that  turn.  What  is  the  optimal  level  of  communication  to  make  this  process 
efficient? 

Both of the described applications can be modeled by the same mathematical structure
and are addressed together in the paper. The major result, found through the mean field
approach, states that adding (any) centralization ($p > 0$) allows the average queue
length to scale as $\log_{1/(1-p)} \frac{1}{1-\lambda}$ as $\lambda \to 1$, which is
exponentially smaller than the classic case ($p = 0$) with scaling of
$\frac{1}{1-\lambda}$. A small amount of centralization has a large impact on
performance.

An  interesting  extension  of  this  paper  would  be  to  look  at  centralization  with  transfer 
delays/costs.  This  would  be  much  more  realistic  as  the  applications  of  the  local  stations 
most  often  occur  with  stations  set  apart  by  geography.  If  this  were  the  case,  the  time  costs 
of  transferring  tasks  would  presumably  have  a  large  impact.  Additionally,  this  paper  also 
leaves  much  room  for  examining  centralization  in  systems  with  general  service  distributions 
and  disciplines. 

2.1.7  Cost-Aware  Monitoring 

The  majority  of  the  models  discussed  do  not  consider  a  significant  cost  from  monitoring 
and  information  gathering  in  the  migration  of  jobs.  Breitgand,  Cohen,  Nahir,  and  Raz  [12] 
develop  several  self-adaptive  algorithms,  for  both  centralized  and  fully  distributed  systems, 
which  model  the  trade-off  between  the  gains  from  monitoring  and  the  costs  of  obtaining 
information  about  the  system.  They  define  this  new  system,  aimed  at  maximizing  utility 
for  management,  as  the  Extended  Supermarket  Model  (ESM).  They  have  demonstrated  the 
accuracy  of  their  model  and  predictions  via  both  simulations  and  a  real  testbed.  Through 
analyzing  the  system,  they  have  found  the  optimal  number  of  servers  to  monitor  in  order 
to  achieve  the  minimum  average  time  spent  in  the  system  with  the  optimal  cost  at  each 
service  rate.  Their  cost-aware  model  has  proven  to  provide  dramatic  improvements  over 
the  cost-oblivious  counter-part. 

They  describe  a  system,  which  has  costs  when  a  load  request  is  received  either  from 
other  servers  (when  completing  the  initial  d  sampling  upon  an  arrival)  or  a  monitor.  Load 
requests  are  immediately  responded  to,  therefore  taking  time  away  from  actual  service 
and  affecting  service  time.  This  time  includes  what  it  takes  to  receive  the  request,  parse 
it,  retrieve  the  necessary  information,  and  respond.  Let  m  represent  the  mean  response 
time  to  a  monitoring  request.  Then  C  =  -  =  -  represents  the  impact  on  service  time. 
Logically,  this  cannot  be  viewed  as  negligible  and  should  be  considered  in  load  balancing 
problems.  Because  the  mean  service  time  is  normalized  to  1,  the  effective  service  rate  must 
be smaller. Because each new arrival creates d load queries, the effective service rate is:

\[ \mu' = 1 - \lambda \cdot d \cdot C. \]

Subsequently, let $\rho = \lambda / \mu'$ be the arrival rate adjusted for the effective
service rate. Stability requires $\rho < 1$, that is, $\lambda < 1 - \lambda d C$.
Therefore, it must be the case that $\lambda < \frac{1}{1 + dC}$. This shows right away
that when the load is high, monitoring can be unwise and cause an unstable system.
However, when the load is low enough, monitoring can be a very powerful tool.
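
The effect of monitoring overhead on stability can be illustrated with a few lines of code. This is a minimal sketch assuming the effective service rate $1 - \lambda d C$ given above and the stability requirement that the adjusted load stay below 1; the function names are hypothetical and are not taken from [12].

```python
def effective_service_rate(lam, d, C):
    """Effective service rate 1 - lam*d*C when each arrival triggers d load queries."""
    return 1.0 - lam * d * C


def effective_load(lam, d, C):
    """Arrival rate divided by the effective service rate (infinite if no capacity is left)."""
    mu_eff = effective_service_rate(lam, d, C)
    if mu_eff <= 0.0:
        return float("inf")
    return lam / mu_eff


def max_stable_arrival_rate(d, C):
    """Largest lam with effective load below 1: solving lam < 1 - lam*d*C gives 1/(1 + d*C)."""
    return 1.0 / (1.0 + d * C)


if __name__ == "__main__":
    for lam in (0.5, 0.9):
        print(lam, effective_load(lam, d=2, C=0.1), lam < max_stable_arrival_rate(2, 0.1))
```

With d = 2 and C = 0.1 (illustrative values only), a load of 0.5 remains stable while a load of 0.9 does not, which is the high-load instability described in the text.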

Breitgand et al. present two different forms of a centralized system. The first utilizes
periodic updates, which are stored in a centralized database. When new arrivals occur,
the information on the d queues sampled is looked up on this central database. Q updates
occur per unit of time. Checking more often provides greater accuracy of information
in the database but increases communication costs greatly. The second form is per-job
polling. When examining the centralized case, Breitgand et al. found that, contrary to
popular thought, there is no significant difference between periodic updates and per-job
polling when the optimal d is chosen, although when d is large, per-job polling greatly
outperforms periodic updates. In the adaptive algorithm, a centralized router
dynamically estimates the load ($\lambda \cdot n$) and the overhead efficiency ratio C.
With this information, an optimal d is pulled from a look-up table. Subsequently, the
task is sent to the least-loaded of the d queues sampled.

For  the  fully-distributed  case,  two  different  main  algorithms  were  developed.  The  first 
relied  on  a  pre-determined  look-up  table,  as  in  the  centralized  case.  This  look-up  table 
indicates the optimal d based on dynamic estimates of the current load ($\lambda \cdot n$)
and the current overhead efficiency ratio C. The alternate algorithm consists of each
server dynamically performing a cost-benefit analysis of forwarding the job received. If
the cost of forwarding is determined to be greater than the expected gains, the task
will be routed to the least-loaded of the queues previously sampled. Forwarding a task
now costs the time it takes to transfer the task ($C_T$) and the CPU cost of analyzing
the effectiveness of forwarding. These
algorithms  provide  a  great  framework  for  explicitly  analyzing  systems  with  non-negligible 
communication  and  decision-making  costs. 

The  opportunities  to  expand  upon  this  research  and  apply  other  forms  of  costs  to  the 
load  balancing  problem  are  numerous  and  fascinating.  The  Extended  Supermarket  Model 
provides an effective way of quantifying costs of monitoring and greatly enhances the ability
to  analyze  systems  with  non-negligible  costs.  This  concept  of  explicitly  evaluating  cost  in 
models  is  intriguing  and  very  applicable  to  similar  game  theory  related  applications  of  the 
supermarket  model. 

2.2  Other  Generalizations 

The  analysis  of  such  randomized  load  balancing  algorithms  is  still  a  very  active  area  of 
current research. Recent works have considered other disciplines besides FIFO, such as
last-in first-out preemptive resume (LIFO-PR) and processor sharing (PS) [13] and [14], as
well as more general service distributions [15], and many open problems remain.

3  The  Threshold  Supermarket  Model

3.1  Description  of  the  Model 

Recall  that  the  supermarket  model  describes  a  system  in  which  customers  arrive  to  a 
service center with n parallel queues according to a Poisson process with rate $\lambda n$, where
$\lambda < 1$. Each queue has a dedicated server that processes tasks according to a FIFO service
discipline. Service times are i.i.d. exponentially distributed with mean $\mu = 1$. After
sampling  d  of  the  n  queues  independently  and  uniformly  at  random  (with  replacement) 
upon  arrival,  each  customer  joins  the  shortest  queue,  with  any  ties  being  broken  uniformly 
at  random. 

In  this  thesis,  we  analyze  a  variant  of  the  supermarket  model  in  which  the  number  of 
queues  sampled  is  determined  according  to  the  following  threshold  function: 

\[
\#\text{ of queues sampled} =
\begin{cases}
1 & \text{if } q < T, \\
d & \text{if } q \ge T,
\end{cases}
\tag{2}
\]

where  q  is  defined  to  be  the  observed  length  of  the  initial  queue  sampled.  This  thesis 
focuses  on  the  case  where  d  =  2. 

Note that when T = 0, our model reduces to the supermarket model of [1] or Mitzenmacher’s
[2] “The Power of Two Choices,” and the number of queues sampled becomes a
fixed  constant  d.  In  such  a  case,  it  is  well  known  that  increasing  the  number  of  queues 
polled  from  d  =  1  to  just  d  =  2  improves  waiting  time  exponentially  and  captures  the 
majority  of  the  gain  from  allowing  d  >  2.  However,  sampling  d  =  2  queues  with  every 
arrival  can  be  very  costly  to  the  system.  So  a  logical  question  to  ask  is:  Does  the  system 
require  d  =  2  queues  to  be  sampled  for  every  customer,  or  are  there  some  circumstances  in 
which  d  =  1  is  sufficient?  If  the  length  of  the  first  queue  sampled  is  “small  enough,”  can  the 
extra  sampling  be  eliminated  without  either  hurting  the  total  throughput  of  the  system  or 
leading  to  a  significant  deterioration  in  the  expected  steady  state  sojourn  time?  If  so,  what 
threshold  is  appropriate?  These  questions  led  to  the  development  of  our  threshold  model. 
In  this  thesis,  we  seek  to  answer  the  question:  Given  a  marginal  cost  for  sampling,  does 
there  exist  an  optimal  threshold  T*  >  0  which  minimizes  the  overall  cost  of  the  algorithm, 
measured  as  the  sum  of  the  expected  waiting  time  and  total  cost  of  sampling? 

The proposed system is designed as follows. Customers arrive according to a Poisson
process with rate $\lambda n$, where $\lambda < 1$. Upon arrival, each customer samples
one queue uniformly at random. If the queue length exceeds or is equal to some threshold
T, the customer samples an additional queue and joins the shorter of the d = 2 queues
sampled. However, if the initial queue length observed is below the threshold T, the
customer simply joins this queue, eliminating the cost that would have resulted if an
additional queue had been sampled. The logic behind such a system is as follows: when a
queue is short enough (less than T), the benefit of sampling an additional queue may be
outweighed by the cost of extra polling. Therefore, by finding the optimal threshold
$T^*$, the system will effectively be eliminating unnecessary search costs and
optimizing the efficiency of the system.
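
The routing rule just described can be sketched in a few lines. This is an illustrative sketch for d = 2, not the implementation used in the thesis; the function name `choose_queue` and its return convention (queue index plus number of queues sampled) are assumptions made here.

```python
import random


def choose_queue(queues, T, rng=random):
    """Pick a queue for an arriving customer under the threshold rule with d = 2.

    queues: list of current queue lengths.
    Returns (index of the queue joined, number of queues sampled).
    """
    first = rng.randrange(len(queues))
    if queues[first] < T:
        # Short enough: join immediately and skip the second sample.
        return first, 1
    # Threshold reached: sample a second queue (with replacement).
    second = rng.randrange(len(queues))
    if queues[second] < queues[first]:
        return second, 2
    if queues[second] == queues[first] and rng.random() < 0.5:
        # Ties are broken uniformly at random.
        return second, 2
    return first, 2


if __name__ == "__main__":
    lengths = [0, 3, 1, 5]
    index, sampled = choose_queue(lengths, T=2, rng=random.Random(0))
    print(index, sampled)
```

Setting T = 0 makes every arrival sample two queues, which recovers the classic two-choice rule.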

We now introduce some common notation that will be used throughout our analysis. So
far in our analysis, we have already assigned n to be the number of servers in our
system, d to be the number of queues sampled by each customer, and T to be the threshold
value that determines d. Additionally, we define $m_i^{(n)}(t)$ to be the number of
queues with at least i customers at time t, and $s_i^{(n)}(t) = \frac{m_i^{(n)}(t)}{n}$
to be the fraction of queues with at least i customers at time t. For convenience, the
$s_i^{(n)}(t)$ notation will be used, and the reference to time t will be excluded when
the meaning is clear.


3.2  Main  Results 

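Define $\bar{S}$ to be the space of sequences $x = \{x_j\}_{j=0}^{\infty} \in [0,1]^{\infty}$ such that:

\[ x_0 = 1 \ge x_1 \ge \ldots \ge x_k \ge \ldots \]

and equip $\bar{S}$ with the metric:

\[ \rho(x, x') = \sup_{j > 0} \frac{|x_j - x'_j|}{j}, \qquad x, x' \in \bar{S}. \]

Let $S$ be the subspace of the metric space $\bar{S}$ defined by:

\[ S = \Big\{ s \in \bar{S} : \sum_{i=1}^{\infty} s_i < \infty \Big\}. \]

Also, let $S_n = S \cap \frac{1}{n}\mathbb{Z}^{\infty}$, that is, the sequences in $S$
whose entries are integer multiples of $1/n$. When the number of servers is n, we shall
encode the state of the system in the threshold model in the countable vector-valued
process $s^{(n)}(t) = (s_0^{(n)}(t), s_1^{(n)}(t), \ldots, s_i^{(n)}(t), \ldots)$. Note
that for each $n \in \mathbb{N}$ and $t \ge 0$, $s^{(n)}(t)$ lies in $S_n$.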

Theorem 1. For each $n \in \mathbb{N}$, $\{s^{(n)}(t),\ t \ge 0\}$ is a jump Markov
process that is ergodic and has a unique stationary distribution $\pi^{(n)}$.

Our goal is to find an approximation for this unique stationary distribution
$\pi^{(n)}$. To do so, we first show that for large n, the evolution of $s^{(n)}$ can
be approximated by the unique solution s of the following coupled system of ODEs: for
$t \ge 0$,


\[
\frac{ds_i(t)}{dt} =
\begin{cases}
\lambda \big(s_{i-1}(t) - s_i(t)\big)\big(1 + s_T(t)\big) - \big(s_i(t) - s_{i+1}(t)\big) & \text{if } T > i - 1, \\
\lambda \big(s_{i-1}^2(t) - s_i^2(t)\big) - \big(s_i(t) - s_{i+1}(t)\big) & \text{if } T \le i - 1,
\end{cases}
\tag{3}
\]

\[ s_0 = 1, \qquad s_1 = \lambda, \]

with appropriate initial condition $s(0) \in S$.

Note that when T = 0, this reduces to the system of ODEs (1) associated with the
supermarket model.

(See  Section  4.1  for  full  derivation.) 
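
To get a feel for system (3), it can be integrated numerically. The sketch below uses a plain forward Euler scheme on a truncated system and is only an illustration under stated assumptions: the truncation level `K` (with $s_{K+1} = 0$), the step size `dt`, the horizon `t_end`, and the choice of starting from an empty system are all made here, and only $s_0 = 1$ is held fixed while $s_1$ is allowed to evolve. The function names are hypothetical.

```python
def threshold_ode_rhs(s, lam, T):
    """Right-hand side of system (3) for i = 1..K, with s[0] fixed at 1."""
    K = len(s) - 1
    ds = [0.0] * (K + 1)
    for i in range(1, K + 1):
        s_next = s[i + 1] if i < K else 0.0  # truncation: s_{K+1} = 0
        if T > i - 1:
            arrivals = lam * (s[i - 1] - s[i]) * (1.0 + s[T])
        else:
            arrivals = lam * (s[i - 1] ** 2 - s[i] ** 2)
        ds[i] = arrivals - (s[i] - s_next)
    return ds


def integrate_threshold_ode(lam, T, K=20, dt=0.01, t_end=200.0):
    """Forward Euler integration of (3) starting from an empty system (requires T <= K)."""
    s = [1.0] + [0.0] * K
    for _ in range(int(t_end / dt)):
        ds = threshold_ode_rhs(s, lam, T)
        for i in range(1, K + 1):
            s[i] += dt * ds[i]
    return s


if __name__ == "__main__":
    s = integrate_threshold_ode(lam=0.5, T=2)
    print([round(v, 4) for v in s[:5]])
```

For $\lambda = 0.5$ and T = 2 the trajectory settles near $s_1 = 0.5$ and $s_2 = 0.2$, so the freely evolving $s_1$ ends up at $\lambda$.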

Upon  analyzing  this  system,  we  come  to  our  next  result. 

Theorem 2. Given any $s_0 \in \bar{S}$, the system of ODEs has a unique solution
$s(t) = \{s_i(t)\}_{i=0}^{\infty}$, $t \ge 0$, with initial condition $s(0) = s_0$.
Moreover, if $s(0) \in S$, then $s(t) \in S$ for all $t \ge 0$. For any sequence
$s^{(n)}(0) \in S_n$, $n \in \mathbb{N}$, such that

\[ \lim_{n \to \infty} \rho\big(s^{(n)}(0), s(0)\big) = 0, \]

for every $R < \infty$, we have

\[ \lim_{n \to \infty} \sup_{t \in [0, R]} \rho\big(s^{(n)}(t), s(t)\big) = 0. \]

(See  Section  4.1  for  detailed  proof.) 


We now study fixed points of the system of ODEs for the threshold supermarket model,
which are obtained by setting $ds_i/dt = 0$ in (3). These lead to algebraic equations
which, unlike in the case of the original supermarket model, lead to an implicit
recursion due to the presence of $s_T$ in the equations for $s_i$ for $i \le T$. The
full details of this derivation are given in Section 4.2. In a nutshell, each term
$s_i^*$ in the solution of the recursion that describes the fixed point can be expressed
in terms of (the a priori unknown) $s_T^*$, which itself is then characterized as the
stable root of a certain function H, which we now describe. The function H is of the
form $H = H_1 - H_2$, where

\[
H_1(s_T) = (1 + s_T)^T, \qquad
H_2(s_T) = \frac{s_T - \lambda s_T^2}{\lambda^T (1 - \lambda)}.
\tag{4}
\]


Definition 1. We will say $s^*$ is a root of H if $H(s^*) = 0$. Moreover, a root $s^*$
is said to be stable if $H'(s^*) < 0$.

Lemma 1. If $\lambda < \frac{1}{2}$, H has a unique root in [0, 1]. If
$\lambda \ge \frac{1}{2}$, H has two roots in [0, 1], of which only one is stable.

(See Section 4.2 for proof.)

Remark 1. By Lemma 1, we know that H has a unique stable root, which we will denote
by $s_T^*$.

Table 2 summarizes the numerically computed values of $s_T^*$ for different values of
$\lambda$ and T.

Table 2: $s_T^*$

$\lambda$    Threshold $T$    $s_T^*$
0.40         2                0.1290
0.40         3                0.0446
0.40         4                0.0165
0.40         5                0.0064
0.50         2                0.2000
0.50         3                0.0828
0.50         4                0.0368
0.50         5                0.0172
0.70         2                0.4050
0.70         3                0.2232
0.70         4                0.1282
0.70         5                0.0774
0.80         2                0.5517
0.80         3                0.3505
0.80         4                0.2243
0.80         5                0.1490
0.90         2                0.7431
0.90         3                0.5618
0.90         4                0.4087
0.90         5                0.2994
0.95         2                0.8616
0.95         3                0.7323
0.95         4                0.5887
0.95         5                0.4631
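
The tabulated values can be spot-checked by substituting them into H. The sketch below assumes $H = H_1 - H_2$ with $H_1(s) = (1 + s)^T$ and $H_2(s) = (s - \lambda s^2) / (\lambda^T (1 - \lambda))$; the function name `H` and the chosen sample of table entries are illustrative only.

```python
def H(s, lam, T):
    """H = H1 - H2, with H1(s) = (1 + s)^T and H2(s) = (s - lam*s^2) / (lam^T * (1 - lam))."""
    h1 = (1.0 + s) ** T
    h2 = (s - lam * s * s) / (lam ** T * (1.0 - lam))
    return h1 - h2


if __name__ == "__main__":
    # (lambda, T, tabulated root) triples copied from Table 2.
    samples = [(0.40, 2, 0.1290), (0.50, 2, 0.2000), (0.50, 3, 0.0828)]
    for lam, T, root in samples:
        # Residuals should be close to zero, up to the four-digit rounding of the table.
        print(lam, T, root, H(root, lam, T))
```

The residuals are not exactly zero because the table reports roots to four decimal places.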

We  are  now  ready  to  state  the  main  result  for  the  threshold  model. 

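Theorem 3. As $n \to \infty$, the stationary distribution
$\pi^{(n)} = (\pi_i^{(n)})_{i=0}^{\infty}$ of $s^{(n)}$ converges to $\pi \in S$, where
$\pi_0 = 1$, $\pi_1 = \lambda$, and

\[
\pi_i =
\begin{cases}
\dfrac{1 - \lambda}{1 - \lambda(1 + s_T^*)} \big(\lambda(1 + s_T^*)\big)^i
+ \dfrac{\lambda - \lambda(1 + s_T^*)}{1 - \lambda(1 + s_T^*)}
& \text{for } 2 \le i \le T + 1, \\[2ex]
s_T^* \big(\lambda s_T^*\big)^{2^{i-T} - 1}
& \text{for } i > T + 1,
\end{cases}
\tag{5}
\]

where $s_T^*$ is the unique stable root of H.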

(See  Section  4.2  for  detailed  proof  and  derivation.) 
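
Given a value of $s_T^*$ (for example from Table 2), the limiting tail probabilities in Theorem 3 can be evaluated directly. The following is a sketch under the assumption that the two branches are $\pi_i = \frac{1-\lambda}{1-a} a^i + \frac{\lambda - a}{1-a}$ with $a = \lambda(1 + s_T^*)$ for $2 \le i \le T+1$ and $\pi_i = s_T^* (\lambda s_T^*)^{2^{i-T}-1}$ beyond that, with $\pi_0 = 1$ as required for elements of the state space. The function name `limiting_tails` is hypothetical.

```python
def limiting_tails(lam, T, s_T, K):
    """Return [pi_0, ..., pi_K] (K >= 1) from the closed form, given the root s_T."""
    a = lam * (1.0 + s_T)
    pi = [1.0, lam]
    for i in range(2, K + 1):
        if i <= T + 1:
            # Geometric-type regime below the threshold (requires a != 1).
            value = (1.0 - lam) / (1.0 - a) * a ** i + (lam - a) / (1.0 - a)
        else:
            # Doubly exponential decay above the threshold.
            value = s_T * (lam * s_T) ** (2 ** (i - T) - 1)
        pi.append(value)
    return pi


if __name__ == "__main__":
    # lambda = 0.5, T = 2, s_T^* = 0.2000 from Table 2.
    print(limiting_tails(0.5, 2, 0.2, 5))
```

For $\lambda = 0.5$, T = 2, the entry $\pi_2$ reproduces $s_T^* = 0.2$, and $\pi_3 = 0.02$ equals $\lambda (s_T^*)^2$, so the two branches agree at $i = T + 1$.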


3.3  Finding  the  Socially  Optimal  Threshold:  Algorithm  Numerics 

We  now  examine  an  application  of  the  main  result:  analyzing  the  tradeoffs  in  performance 
associated  with  eliminating  the  extra  cost  of  unnecessary  polling.  We  seek  to  identify  the 
optimal threshold that minimizes the cost per customer, where given a threshold T,
arrival rate $\lambda$, and a unit cost of sampling $c_s$, the cost is defined to be the
sum of the limiting average size of the queue length and the total sampling cost. Thus,
the expected total cost of each customer can be represented by the following equation:

\[
\text{Cost}(T) = E[\text{queue length}] + c_s E[\text{number of extra searches}]
= \sum_{i=1}^{\infty} s_i + c_s s_T.
\tag{6}
\]


The optimal threshold $T^*$ can be found by minimizing this cost function with respect
to T (that is, $T^* = \arg\min_T \text{Cost}(T)$). Given $\lambda$ and the cost of
sampling, numerically solving for $T^*$ is straightforward.
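
As a small illustration of this minimization, the sketch below evaluates the cost for two candidate thresholds at $\lambda = 0.5$. The tail values are truncated after $i = 4$ and are hard-coded here as an assumption: they follow from the fixed-point formulas with $s_1^* = 0.5$ for T = 1 and $s_2^* = 0.2$ for T = 2. The helper names `cost` and `best_threshold` are hypothetical.

```python
def cost(tails, T, c_s):
    """Cost(T) = sum_{i>=1} s_i + c_s * s_T, for tails = [s_0, s_1, s_2, ...]."""
    return sum(tails[1:]) + c_s * tails[T]


def best_threshold(tails_by_T, c_s):
    """Return the candidate threshold with the smallest cost."""
    return min(tails_by_T, key=lambda T: cost(tails_by_T[T], T, c_s))


if __name__ == "__main__":
    # Truncated fixed points for lambda = 0.5 (illustrative inputs).
    tails_by_T = {
        1: [1.0, 0.5, 0.125, 0.0078125, 3.0517578125e-05],
        2: [1.0, 0.5, 0.2, 0.02, 0.0002],
    }
    for c_s in (0.1, 0.5):
        costs = {T: round(cost(t, T, c_s), 4) for T, t in tails_by_T.items()}
        print(c_s, costs, best_threshold(tails_by_T, c_s))
```

Among these two candidates, T = 1 has the lower cost at $c_s = 0.1$ and T = 2 at $c_s = 0.5$; a full search would include further values of T.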

The following thresholds were found to be socially optimal given various values for
$\lambda$ and sampling cost $c_s$:


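Table 3: Optimal Threshold

$\lambda$    $c_s$    $T^*$
0.40         0.1      1
0.40         0.5      2
0.40         1.0      3
0.40         1.5      4
0.40         2.0      5
0.50         0.1      1
0.50         0.5      2
0.50         1.0      3
0.50         1.5      3
0.50         2.0      4
0.70         0.1      1
0.70         0.5      2
0.70         1.0      3
0.70         1.5      3
0.70         2.0      4
0.80         0.1      1
0.80         0.5      2
0.80         1.0      3
0.80         1.5      3
0.80         2.0      4
0.90         0.1      1
0.90         0.5      2
0.90         1.0      3
0.90         1.5      3
0.90         2.0      4
0.95         0.1      1
0.95         0.5      2
0.95         1.0      3
0.95         1.5      3
0.95         2.0      4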

These results indicate that for a sampling cost $c_s > 0$, it is always more efficient
to implement a threshold $T > 0$. In other words, our threshold model always performs
better than the classic “Power of Two Choices” [2] case where T = 0. Additionally, these
results show that as the marginal cost of sampling increases, the benefit of
implementing a threshold model increases with it and becomes quite large. The following
figures compare the average cost per customer in the case without a threshold (when
T = 0) to our case where an optimal threshold $T^*$ is used to determine the number of
queues sampled. As can be seen, for all $\lambda$ and cost structures, our threshold
model results in a lower average cost per customer and therefore a more efficient
system.

[Figure 3: Detailed Comparison for $\lambda = 0.80$; horizontal axis: cost of sampling]

[Figure 4: General Comparison; surviving panel labels: (d) $\lambda = 0.80$, (e) $\lambda = 0.90$, (f) $\lambda = 0.95$]

The following figures show how the gains from using the optimal threshold $T^*$ scale as
we approach heavy traffic. As can be seen, as arrival rates increase, the gains from
using the optimal threshold $T^*$ decrease (but do not disappear). This intuitively
makes sense. For a fixed threshold $T^*$, if one increases $\lambda$, then the
probability of finding a queue length above $T^*$ increases, and thus the model behaves
more like the supermarket model. This suggests that it may be appropriate to scale $T^*$
in an appropriate way with $\lambda$.

[Figure 5: Percentage Gain for $c_s = 1$; horizontal axis: arrival rate]

3.4  Simulations

We also ran simulations of our threshold supermarket model with n = 100. Again, we
focused on the expected total cost of each customer as a proxy for system performance.
The following results are obtained by averaging ten trials, where each trial looks at
time steps 10,000 to 100,000. The first 9,999 steps are discarded in order to let the
system reach equilibrium. The following table shows how close the simulations are to
the predictions: the relative error is below 2.4% in every case.


Table 4: Simulations: Total Cost Comparisons

$\lambda$    $c_s$    $T^*$    Simulation $C(T^*)$    Prediction $C(T^*)$    Rel. Error (%)
0.40         0.1      1        0.5085                 0.5056                 0.5736
0.40         0.5      2        0.6013                 0.6002                 0.1833
0.40         1.0      3        0.6380                 0.6392                 0.1877
0.40         1.5      3        0.6495                 0.6543                 0.7336
0.40         2.0      5        0.6528                 0.6609                 1.2256
0.50         0.1      1        0.6857                 0.6828                 0.4247
0.50         0.5      2        0.8201                 0.8202                 0.0122
0.50         1.0      3        0.9002                 0.8983                 0.2115
0.50         1.5      3        0.9344                 0.9396                 0.5534
0.50         2.0      4        0.9504                 0.9583                 0.8244
0.70         0.1      1        1.2046                 1.2001                 0.3750
0.70         0.5      2        1.4388                 1.4315                 0.5100
0.70         1.0      3        1.6277                 1.6252                 0.1538
0.70         1.5      3        1.7496                 1.7368                 0.7370
0.70         2.0      4        1.8355                 1.8352                 0.0163
0.80         0.1      1        1.6438                 1.6379                 0.3602
0.80         0.5      2        1.9309                 1.9203                 0.5520
0.80         1.0      3        2.2026                 2.1909                 0.5340
0.80         1.5      3        2.3979                 2.3662                 1.3397
0.80         2.0      4        2.5264                 2.5309                 0.1778
0.90         0.1      1        2.4245                 2.4427                 0.7451
0.90         0.5      2        2.7620                 2.7803                 0.6582
0.90         1.0      3        3.1276                 3.1446                 0.5406
0.90         1.5      3        3.4208                 3.4256                 0.1401
0.90         2.0      4        3.6505                 3.6828                 0.8771
0.95         0.1      1        3.2305                 3.3089                 2.3694
0.95         0.5      2        3.5982                 3.6765                 2.1297
0.95         1.0      3        4.0341                 4.0993                 1.5905
0.95         1.5      3        4.4201                 4.4654                 1.0145
0.95         2.0      4        4.7527                 4.7938                 0.8574

We  also  performed  the  same  analysis  (comparing  the  results  from  T  =  T*  to  T  =  0)  as 
was  done  using  the  numerics  in  Section  3.3.  Figures  7  and  8  depict  the  results. 

[Figure 7: Simulations Comparison for $\lambda = 0.80$; horizontal axis: cost of sampling]

[Figure 8: General Simulations Comparison; panels: (a) $\lambda = 0.40$, (b) $\lambda = 0.50$, (c) $\lambda = 0.70$, (d) $\lambda = 0.80$, (e) $\lambda = 0.90$, (f) $\lambda = 0.95$]

As  can  be  seen,  these  results  mirror  those  found  from  the  numerics.  That  is,  for  any 
cost  greater  than  zero,  our  threshold  model  always  outperforms  the  parallel  model  without 
a  threshold.  Again,  as  the  cost  of  sampling  increases,  the  benefits  from  using  a  threshold 
increase  as  well. 
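
A simulation of this kind can be set up in several ways. The sketch below shows one possibility and is not the code used for Table 4: it assumes a uniformized jump chain in which each step is an arrival with probability $\lambda / (\lambda + 1)$ and otherwise a potential service completion at a uniformly chosen server. The function name, the default step counts, and the seed are all illustrative choices.

```python
import random


def simulate_threshold_model(n=100, lam=0.5, T=2, steps=200_000,
                             burn_in=20_000, seed=0):
    """Return (time-average queue length per server, fraction of arrivals sampling twice)."""
    rng = random.Random(seed)
    queues = [0] * n
    total = 0                      # jobs currently in the system
    p_arrival = lam / (lam + 1.0)  # arrivals at rate n*lam, services at rate n*1
    length_sum = 0.0
    arrivals = extra = 0
    for step in range(steps):
        if rng.random() < p_arrival:
            first = rng.randrange(n)
            target = first
            second_sampled = queues[first] >= T
            if second_sampled:
                second = rng.randrange(n)  # sampled with replacement
                shorter = queues[second] < queues[first]
                tie = queues[second] == queues[first] and rng.random() < 0.5
                if shorter or tie:
                    target = second
            queues[target] += 1
            total += 1
            if step >= burn_in:
                arrivals += 1
                extra += second_sampled
        else:
            server = rng.randrange(n)
            if queues[server] > 0:  # an idle server has nothing to complete
                queues[server] -= 1
                total -= 1
        if step >= burn_in:
            length_sum += total / n
    return length_sum / (steps - burn_in), extra / arrivals


if __name__ == "__main__":
    mean_len, extra_frac = simulate_threshold_model()
    c_s = 0.5
    print(mean_len, extra_frac, mean_len + c_s * extra_frac)
```

For $\lambda = 0.5$ and T = 2, the two estimates can be compared with the limiting values $\sum_{i \ge 1} s_i \approx 0.72$ and $s_T^* = 0.2$ that enter the predicted cost.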

4  Proofs  of  Theorems

4.1  Convergence  to  the  System  of  Ordinary  Differential  Equations 

Recall the state process $s^{(n)}$ lies in $S_n$ and was defined in Section 3.2. To
repeat: $s_i^{(n)}(t)$ is the random variable representing the fraction of queues in the
n-server system that have i or more packets at time t. Given our assumptions of Poisson
arrivals and exponential service times, it follows that $s^{(n)}$ is a jump Markov
process with jump directions lying in $V/n$, where
$V = \{\pm e_i,\ i = 1, \ldots, k, \ldots\}$ and $e_i$ is defined to be the unit vector
in $[0, 1]^{\infty}$ that has a 1 at the ith coordinate and 0 otherwise. For
$s \in S_n$, the transition $s \to s + \frac{e_i}{n}$ represents the event that an
arriving job joins a queue with $i - 1$ jobs.

Suppose $0 \le i - 1 < T$ (equivalently, $1 \le i < T + 1$). Then this transition can happen in two ways. The first way is if there is a new arrival and the first queue sampled has $i - 1$ jobs. Since $i - 1 < T$, this new arrival will then join this queue. When the system is in state $s \in S^n$, the fraction of queues with exactly $i - 1$ jobs is $(s_{i-1} - s_i)$, and so this event occurs at rate $n\lambda(s_{i-1} - s_i)$. The second way the transition $s \to s + \frac{e_i}{n}$ can happen is if there is a new arrival and the first queue sampled has $T$ or more jobs (which happens with probability $s_T$). In this case, the policy dictates that a second queue should also be sampled, and the probability that the second queue sampled has $i - 1$ jobs is again $(s_{i-1} - s_i)$. Moreover, since $i - 1 < T$, the arriving job, in this case, will join the queue with $i - 1$ packets, thus increasing the number of queues with $i$ packets. This second event occurs at rate $n\lambda s_T(s_{i-1} - s_i)$, so the total rate of the transition is $n\lambda(s_{i-1} - s_i)(1 + s_T)$.

Now consider the case when $i - 1 \ge T$. In this case, as in the original supermarket model, the transition $s \to s + \frac{e_i}{n}$ occurs if and only if an arrival samples two queues, and the smaller of the two queue lengths has $i - 1$ jobs. This happens at rate $n\lambda(s_{i-1}^2 - s_i^2)$.

Finally, the transition $s \to s - \frac{e_i}{n}$ occurs due to departures from a queue with $i$ jobs. Since the service rate is $\mu = 1$, this happens at a rate proportional to the number of queues with exactly $i$ jobs, which is $n(s_i - s_{i+1})$.

Putting this all together, we see that $s^{(n)}$ is a jump Markov process with the following jump directions and rates:

Jump Direction                              Jump Rate
$+\frac{e_i}{n}$, if $1 \le i < T + 1$,     $n\lambda(s_{i-1} - s_i)(1 + s_T)$,
$+\frac{e_i}{n}$, if $T + 1 \le i$,         $n\lambda(s_{i-1}^2 - s_i^2)$,
$-\frac{e_i}{n}$, if $1 \le i$,             $n(s_i - s_{i+1})$.
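The jump rates can be tabulated numerically for a given state. The following is a minimal sketch, not the authors' code: the function name `jump_rates` and the list representation of the state (with `s[0] = 1` and a truncated tail) are assumptions made for illustration, and departures are taken at unit service rate, $n(s_i - s_{i+1})$, matching the drift $F$ used later in this section.

```python
def jump_rates(s, lam, T, n):
    """Return (up, down) dicts mapping level i to the rate of the jumps
    s -> s + e_i/n and s -> s - e_i/n.

    s[i] is the fraction of queues with at least i jobs, with s[0] == 1.
    Levels beyond the end of the list are treated as empty (fraction 0).
    Requires T < len(s) so that s[T] is available.
    """
    up, down = {}, {}
    for i in range(1, len(s)):
        if i <= T:
            # first sampled queue has i-1 < T jobs, or it has >= T jobs
            # and the second sampled queue has i-1 jobs
            up[i] = n * lam * (s[i - 1] - s[i]) * (1.0 + s[T])
        else:
            # two queues are sampled and the shorter one has i-1 jobs
            up[i] = n * lam * (s[i - 1] ** 2 - s[i] ** 2)
        nxt = s[i + 1] if i + 1 < len(s) else 0.0
        # unit-rate service at each queue with exactly i jobs
        down[i] = n * (s[i] - nxt)
    return up, down


if __name__ == "__main__":
    up, down = jump_rates([1.0, 0.5, 0.25, 0.0], lam=0.8, T=1, n=100)
    print(sum(up.values()), sum(down.values()))
```

A useful sanity check is that the arrival rates sum to $n\lambda$ whenever the last level is empty, since every arrival joins some queue, and that the departure rates sum to $n s_1$, the number of busy servers.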
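The generator $\mathcal{L}_n$ of the Markov process $s^{(n)}$ thus takes the form:

$$\begin{aligned}
\mathcal{L}_n f(s) = {} & n \sum_{i=1}^{T} \Big( \lambda(s_{i-1} - s_i)(1 + s_T)\Big[f\Big(s + \frac{e_i}{n}\Big) - f(s)\Big] + (s_i - s_{i+1})\Big[f\Big(s - \frac{e_i}{n}\Big) - f(s)\Big] \Big) \\
& + n \sum_{i=T+1}^{\infty} \Big( \lambda(s_{i-1}^2 - s_i^2)\Big[f\Big(s + \frac{e_i}{n}\Big) - f(s)\Big] + (s_i - s_{i+1})\Big[f\Big(s - \frac{e_i}{n}\Big) - f(s)\Big] \Big)
\end{aligned}$$

for all bounded functions $f : S^n \to \mathbb{R}$ and $s \in S^n$.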

Proof of Theorem 1. Theorem 1 can be established using methods exactly analogous to those used by Vvedenskaya, Dobrushin, and Karpelevich in Theorem 5 of their paper [1]. We do not provide the details here.

In  order  to  prove  Theorem  2,  we  first  state  Kurtz’s  theorem. 

Theorem 4. (Kurtz’s Theorem): Suppose that $\max_i |e_i| = e \in (0, \infty)$ and

$$\max_i \sup_{x \in S} r_i(x) < \infty.$$

Further, suppose we have a density dependent family satisfying the Lipschitz condition

$$|F(x) - F(y)| \le K|x - y|,$$

for some constant $K$. Suppose $\lim_{n \to \infty} X_n(0) = x_0$ and let $X$ be the deterministic process:

$$X(t) = x_0 + \int_0^t F(X(u))\,du, \quad t \ge 0.$$

Then,

$$\lim_{n \to \infty} \sup_{u \le t} |X_n(u) - X(u)| = 0 \quad \text{a.s.}$$

As  stated,  Kurtz’s  theorem  does  not  directly  apply  here  because  we  have  a  countable 
(and  not  a  finite)  number  of  jump  directions.  However,  it  turns  out  that  we  can  extend 
the  result  to  our  problem  using  methods  similar  to  those  used  in  [1]  and  [2],  and  with  the 
Euclidean  metric  replaced  by  the  metric  p  introduced  in  Section  3.2.  Below,  we  ignore  the 
finiteness  constraint,  and  verify  the  conditions  of  Kurtz’s  theorem. 


Proof of Theorem 2. We must:
i) Prove that: $\max_i |e_i| = e \in (0, \infty)$.

Recall that the possible transitions are in $V/n$, where $V = \{\pm e_i : i \ge 1\}$ and $e_i$ is defined to be the standard unit vector. It is clear that $\max_i |e_i| = 1 \in (0, \infty)$.

ii) Prove that: $\max_i \sup_{x \in S} r_i(x) < \infty$.

We set $r_{+i}$ and $r_{-i}$ to be the arrival rate and departure rate, respectively. Recall that these rates are defined in Section 4.1. Because $0 \le s_i \le 1$ for all $i$, $\max r_{-i} = 1$. Additionally, $\max r_{+i} = \lambda(1)[1_{\{T > i-1\}}(2) + 1_{\{T \le i-1\}}(2)] = 2\lambda$. Therefore, $\max_i \sup_{x \in S} r_i(x) = \max\{1, 2\lambda\} < \infty$.

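iii) Prove Lipschitz Condition holds: $|F(x) - F(y)| \le K|x - y|$ for some constant $K$.

$$F(x) = \sum_{i=1}^{\infty} \lambda(s_{i-1} - s_i)\big[1_{\{T > i-1\}}(1 + s_T) + 1_{\{T \le i-1\}}(s_{i-1} + s_i)\big] - (s_i - s_{i+1}).$$

Let $x = (x_i)$ and $y = (y_i)$ be two states of the model. Then,

$$\begin{aligned}
|F(x) - F(y)| \le {} & \sum_{i=1}^{\infty} \Big| \lambda(x_{i-1} - x_i)\big[1_{\{T > i-1\}}(1 + x_T) + 1_{\{T \le i-1\}}(x_{i-1} + x_i)\big] - (x_i - x_{i+1}) \\
& \qquad - \lambda(y_{i-1} - y_i)\big[1_{\{T > i-1\}}(1 + y_T) + 1_{\{T \le i-1\}}(y_{i-1} + y_i)\big] + (y_i - y_{i+1}) \Big|, \\
\le {} & 2\sum_{i=0}^{\infty} |x_i - y_i| + 2\lambda \sum_{i=0}^{\infty} \big[1_{\{T > i-1\}}(|x_i - y_i| + |x_i x_T - y_i y_T|) + 1_{\{T \le i-1\}}|x_i^2 - y_i^2|\big], \\
\le {} & 2\sum_{i=0}^{\infty} |x_i - y_i| + 2\lambda \sum_{i=0}^{\infty} \big[1_{\{T > i-1\}}(|x_i - y_i| + |x_i^2 - y_i^2|) + 2 \cdot 1_{\{T \le i-1\}}|x_i - y_i|\big], \\
\le {} & \sum_{i=0}^{\infty} \big(2 + 1_{\{T > i-1\}}6\lambda + 1_{\{T \le i-1\}}4\lambda\big)|x_i - y_i|.
\end{aligned}$$

Note that $0 \le x_i, y_i \le 1$. □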
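The deterministic limit in Kurtz's theorem can be explored numerically by integrating the drift $F$ directly. The sketch below uses a plain forward Euler scheme on a truncated state vector; the function names, the truncation level `K`, the step size `dt`, and the number of steps are all illustrative assumptions rather than anything specified in the text.

```python
def drift(s, lam, T):
    """Mean-field drift for s[1..K]; s[0] = 1 and s[K+1] = 0 are held fixed."""
    K = len(s) - 2
    d = [0.0] * len(s)
    for i in range(1, K + 1):
        if i <= T:
            arrivals = lam * (s[i - 1] - s[i]) * (1.0 + s[T])
        else:
            arrivals = lam * (s[i - 1] ** 2 - s[i] ** 2)
        departures = s[i] - s[i + 1]  # unit service rate
        d[i] = arrivals - departures
    return d


def integrate_mean_field(lam, T, K=30, dt=0.05, steps=20000):
    """Forward Euler from the empty system; returns the state after steps*dt."""
    if T > K:
        raise ValueError("truncation level K must be at least T")
    s = [1.0] + [0.0] * (K + 1)
    for _ in range(steps):
        d = drift(s, lam, T)
        s = [si + dt * di for si, di in zip(s, d)]
    return s


if __name__ == "__main__":
    s = integrate_mean_field(lam=0.5, T=2)
    print(s[1], s[2], s[3])
```

For $\lambda = 0.5$ and $T = 2$ the trajectory should settle near $s_1 = \lambda$, with $s_2$ and $s_3$ satisfying the fixed-point relations derived in Section 4.2; forward Euler was chosen only for transparency, and any standard ODE integrator could be substituted.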


4.2  Analysis  of  the  Fixed  Point 

Proof of Theorem 3 & Lemma 1. Recall that the fixed point of the ODE (3) associated with the threshold supermarket model satisfies the following algebraic recursion equations:

$$s_{i+1} = \begin{cases} s_i - \lambda(s_{i-1} - s_i)(1 + s_T), & \text{if } T > i - 1, \\ s_i - \lambda(s_{i-1}^2 - s_i^2), & \text{if } T \le i - 1, \end{cases}$$

with $s_0 = 1$ and $s_1 = \lambda$, which determines $s_i$ for $i \ge 2$.

4.2.1 Step 1: Finding $s_i$ for $i \le T + 1$

We begin by solving the recursion for $T > i - 1$, assuming $s_T$ is given.

Note that we have a linear homogeneous recurrence relation with constant coefficients of the form $s_{n+1} = c_1 s_n + c_2 s_{n-1}$, where $c_1, c_2 \in \mathbb{R}$ and $c_2 \ne 0$. For such a relation, suppose $x^2 = c_1 x + c_2$ has two distinct roots $r_1$ and $r_2$. Then we know $\{s_n\}$ is a solution of the recursion $s_{n+1} = c_1 s_n + c_2 s_{n-1}$ if and only if $s_n = \alpha_1 r_1^n + \alpha_2 r_2^n$ for $n \in \mathbb{N}$ and some constants $\alpha_1, \alpha_2$.


So, from our recursive relation $s_{i+1} = (1 + \lambda(1 + s_T))s_i - \lambda(1 + s_T)s_{i-1}$, we know that $c_1 = 1 + \lambda(1 + s_T)$ and $c_2 = -\lambda(1 + s_T)$. Then we examine the associated quadratic equation:

$$\begin{aligned}
x^2 - (1 + \lambda(1 + s_T))x + \lambda(1 + s_T) &= 0, \\
x^2 - x - x\lambda(1 + s_T) + \lambda(1 + s_T) &= 0, \\
x(x - 1) - \lambda(1 + s_T)(x - 1) &= 0, \\
[x - \lambda(1 + s_T)][x - 1] &= 0.
\end{aligned}$$

And so, $r_1 = \lambda(1 + s_T)$ and $r_2 = 1$.


(Note that we now assume that $\lambda(1 + s_T) \ne 1$. In what follows $s_T$ will be determined by an implicit equation involving $\lambda$; we will later investigate for what values of $\lambda$ this condition is satisfied.)


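We must now find $\alpha_1$ and $\alpha_2$ such that $s_i = \alpha_1(\lambda^i(1 + s_T)^i) + \alpha_2$.

We know: $s_0 = 1 = \alpha_1 + \alpha_2$, and $s_1 = \lambda = \alpha_1(\lambda(1 + s_T)) + \alpha_2$.
By substitution:

$$\begin{aligned}
\lambda &= \alpha_1\lambda(1 + s_T) + 1 - \alpha_1, \\
\alpha_1(1 - \lambda(1 + s_T)) &= 1 - \lambda, \\
\alpha_1 &= \frac{1 - \lambda}{1 - \lambda(1 + s_T)} \quad \text{and} \quad \alpha_2 = \frac{\lambda - \lambda(1 + s_T)}{1 - \lambda(1 + s_T)}.
\end{aligned}$$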


We have now solved the first part of our recursion:

$$\boxed{\, s_i = \frac{1 - \lambda}{1 - \lambda(1 + s_T)}\,\lambda^i(1 + s_T)^i + \frac{\lambda - \lambda(1 + s_T)}{1 - \lambda(1 + s_T)} \quad \text{for } 2 \le i \le T + 1 \,}$$
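The closed form from Step 1 can be checked against the recursion it solves. The following is a small sketch under the assumption that $s_T$ is supplied as a known number (here an arbitrary test value, not the true fixed-point value); the function names are illustrative.

```python
def head_closed_form(i, lam, s_T):
    """s_i = alpha_1 * r^i + alpha_2 with r = lam * (1 + s_T), assuming r != 1."""
    r = lam * (1.0 + s_T)
    if r == 1.0:
        raise ValueError("closed form requires lam * (1 + s_T) != 1")
    alpha_1 = (1.0 - lam) / (1.0 - r)
    alpha_2 = (lam - r) / (1.0 - r)
    return alpha_1 * r ** i + alpha_2


def head_by_recursion(n, lam, s_T):
    """Iterate s_{i+1} = (1 + r) s_i - r s_{i-1} from s_0 = 1, s_1 = lam."""
    r = lam * (1.0 + s_T)
    s = [1.0, lam]
    while len(s) <= n:
        s.append((1.0 + r) * s[-1] - r * s[-2])
    return s


if __name__ == "__main__":
    lam, s_T = 0.8, 0.3  # s_T is an arbitrary illustrative value
    rec = head_by_recursion(6, lam, s_T)
    for i, value in enumerate(rec):
        print(i, value, head_closed_form(i, lam, s_T))
```

The two columns agree for every index, including $i = 0$ and $i = 1$, which confirms that the constants $\alpha_1$ and $\alpha_2$ match the initial conditions $s_0 = 1$ and $s_1 = \lambda$.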


4.2.2 Step 2: Finding $s_T$

We can now use this equation to solve for $s_T$. By substituting $i = T$ in the above boxed expression for $s_i$, we see that $s_T$ must satisfy:

$$(\lambda - 1)\lambda^T(1 + s_T)^T - \lambda s_T^2 + s_T = 0.$$

Although we cannot explicitly solve this for $s_T$ for a general $T$, we can find important properties that reveal how the equation behaves by arranging it in the following convenient way:

$$(1 + s_T)^T = \frac{s_T - \lambda s_T^2}{\lambda^T(1 - \lambda)}.$$

As in equation (4), we define $H_1$ to be the left-hand side and $H_2$ to be the right-hand side. It is intuitively obvious that $H_1$ is increasing and convex in the interval $[0, 1]$, and that $H_2$ is an inverted parabola. The graph of $H_1$ and $H_2$ looks generally like this:


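Now that we have this insight, we return to the analytics and find that:
1. At $s_T = 0$:

i. $H_1(0) > H_2(0)$.

Proof: $H_1(0) = 1 > H_2(0) = 0$.

ii. $H_2'(0) > H_1'(0)$.

Proof: $H_2'(0) = \frac{1 - 2\lambda s_T}{\lambda^T(1 - \lambda)}\Big|_{s_T = 0} = \frac{1}{\lambda^T(1 - \lambda)}$. $H_1'(0) = T(1 + s_T)^{T-1}\big|_{s_T = 0} = T$. Note that for $0 < \lambda < 1$, $\lambda^T(1 - \lambda) > 0$. So, proving that $H_2'(0) > H_1'(0)$ is equivalent to proving that $\frac{1}{T} > \lambda^T(1 - \lambda)$. And so, if $\frac{1}{T}$ always exceeds the maximum of $\lambda^T(1 - \lambda)$, then we have completed our task.

Finding the max:

$$\begin{aligned}
\frac{d}{d\lambda}\big[\lambda^T(1 - \lambda)\big] = -\lambda^T + (1 - \lambda)T\lambda^{T-1} &= 0, \\
T(1 - \lambda)\lambda^{T-1} &= \lambda^T, \\
T - \lambda T &= \lambda, \\
\lambda(1 + T) &= T, \\
\lambda_{\max} &= \frac{T}{T + 1}.
\end{aligned}$$

Plugging $\lambda_{\max}$ in, the maximum is $\left(\frac{T}{T+1}\right)^T \frac{1}{T+1}$. So, if $\frac{1}{T} > \left(\frac{T}{T+1}\right)^T \frac{1}{T+1}$, we have finished. Because $\frac{T}{T+1} < 1$ and $T \ge 2$, we know $\left(\frac{T}{T+1}\right)^T < 1$ as well. Additionally, because $T + 1$ is larger than $T$, we know $\frac{1}{T+1} < \frac{1}{T}$. Therefore, we know $\left(\frac{T}{T+1}\right)^T \frac{1}{T+1}$ is the product of a quantity that is less than $\frac{1}{T}$ and a quantity that is less than 1, thereby making it necessarily less than $\frac{1}{T}$. Therefore, $\frac{1}{T}$ is clearly greater than $\lambda^T(1 - \lambda)$. Our task is complete.

iii. $H_1'(0), H_2'(0) > 0$.

Proof: $H_1'(0) = T(1 + s_T)^{T-1}\big|_{s_T = 0} = T$. Because $T \ge 2$, it is obvious that $T > 0$ and therefore, $H_1'(0) > 0$.

$H_2'(0) = \frac{1 - 2\lambda s_T}{\lambda^T(1 - \lambda)}\Big|_{s_T = 0} = \frac{1}{\lambda^T(1 - \lambda)}$. Because $0 < \lambda < 1$, it is clear that both $\lambda^T$ and $(1 - \lambda)$ are positive. Thus, both the denominator and numerator are positive. Because the quotient of two positive numbers is always positive, $H_2'(0) > 0$.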

2. $H_1$ begins at $H_1(0) = 1$ and ends at $H_1(1) = 2^T$ and is positive, increasing, and convex for all $s_T$ in the interval $[0, 1]$.

i. $H_1 > 0$ for all $s_T$.

Proof: Because $s_T \in [0, 1]$, we know $H_1 \in [1, 2^T]$, so $H_1 > 0$.

ii. $H_1' > 0$ for all $s_T$.

Proof: $H_1' = T(1 + s_T)^{T-1}$. Because $T \ge 2$, it is obvious that $T > 0$, and because $s_T \in [0, 1]$, it is obvious that $(1 + s_T) \in [1, 2]$ and therefore $(1 + s_T)^{T-1} > 0$. Because the product of two positive quantities is always positive, $H_1' > 0$ for all $s_T$.

iii. $H_1'' > 0$ for all $s_T$.

Proof: $H_1'' = T(T - 1)(1 + s_T)^{T-2}$. Based on the same reasoning as above, we know that $T \ge 2$, $(T - 1) \ge 1$, and $(1 + s_T)^{T-2} \ge 1$. Again, because the product of only positive numbers is always positive, $H_1'' > 0$ for all $s_T$.

3. $H_2$ begins at $H_2(0) = 0$, increases from $s_T = 0$ to some critical point $s_T^{\max}$, decreases thereafter, and ends at $H_2(1) = \frac{1 - \lambda}{\lambda^T(1 - \lambda)} = \frac{1}{\lambda^T}$. Additionally, $H_2$ is concave for all $s_T$.

i. $H_2' > 0$ for $s_T \in [0, s_T^{\max})$.

Proof: First, we must find the critical point $s_T^{\max}$.

At the maximum, $H_2' = \frac{1 - 2\lambda s_T}{\lambda^T(1 - \lambda)} = 0$. So we must find $1 - 2\lambda s_T = 0$, which occurs when $s_T^{\max} = \frac{1}{2\lambda}$.

Now let us examine $H_2'$ for $s_T \in [0, s_T^{\max})$: $H_2' = \frac{1 - 2\lambda s_T}{\lambda^T(1 - \lambda)}$. Because $0 < \lambda < 1$, it is clear that $\lambda^T(1 - \lambda) > 0$. Therefore, $H_2'$ is positive when $1 - 2\lambda s_T > 0$. When $s_T < \frac{1}{2\lambda}$, $1 > 2\lambda s_T$ and therefore $H_2' > 0$ in the interval $[0, s_T^{\max})$.

ii. $H_2' < 0$ for $s_T \in (s_T^{\max}, 1]$.

Proof: $H_2' = \frac{1 - 2\lambda s_T}{\lambda^T(1 - \lambda)}$. By the same reasoning as above, because the denominator is always positive, $H_2'$ is only negative if $1 - 2\lambda s_T < 0$. For $s_T > \frac{1}{2\lambda}$, it is obvious that this inequality holds and therefore, $H_2' < 0$ for $s_T \in (s_T^{\max}, 1]$.

iii. $H_2'' < 0$ for all $s_T$.

Proof: $H_2'' = \frac{-2\lambda}{\lambda^T(1 - \lambda)}$. Because $0 < \lambda < 1$, it is clear that $-2\lambda < 0$ and $\lambda^T(1 - \lambda) > 0$. Because a negative quantity divided by a positive quantity is always negative, it is clear that $H_2'' < 0$ for all $s_T$.


All of these properties combined tell us that: if at some $s_T > 0$, $H_2 > H_1$, then $H_1$ and $H_2$ intersect at least once and at most twice in the interval $[0, 1]$. Furthermore, we know:

(A) When $H_1 > H_2$, $H_2' > H_1'$, and $H_1', H_2' > 0$, there have been zero intersections.

(B) When $H_2 > H_1$, there has been exactly one intersection.

(C) When $H_1 > H_2$, $H_2' < H_1'$, and there has been at least one point below $s_T$ at which $H_1 < H_2$, there have been exactly two intersections.


So, we come to the following conclusions:

As proven in (1), at $s_T = 0$, there have been exactly zero intersections.

Now, we will examine the equation at $s_T = 1$:

I. If $\lambda < \frac{1}{2}$: $H_2 > H_1$ and therefore, in the interval $[0, 1]$, there is exactly one intersection.

Proof: $H_1(1) = 2^T$. $H_2(1) = \frac{1 - \lambda}{\lambda^T(1 - \lambda)} = \frac{1}{\lambda^T}$. We must show that $2^T < \frac{1}{\lambda^T}$, which is the same as showing $(2\lambda)^T < 1$. Because $\lambda < \frac{1}{2}$, we know $2\lambda < 1$ and therefore, $(2\lambda)^T < 1^T = 1$. So, the inequality holds.

II. If $\lambda > \frac{1}{2}$: $H_1 > H_2$, $H_2' < H_1'$, and there has been at least one point below $s_T$ at which $H_1 < H_2$ and therefore, in the interval $[0, 1]$, there have been exactly two intersections.

Proof: i. $H_1(1) = 2^T$. $H_2(1) = \frac{1 - \lambda}{\lambda^T(1 - \lambda)} = \frac{1}{\lambda^T}$. We must show that $2^T > \frac{1}{\lambda^T}$, which is the same as showing $(2\lambda)^T > 1$. Because $\lambda > \frac{1}{2}$, we know $2\lambda > 1$ and therefore, $(2\lambda)^T > 1^T = 1$. So, the inequality holds.

ii. $H_2'(1) = \frac{1 - 2\lambda}{\lambda^T(1 - \lambda)}$ and $H_1'(1) = T(1 + s_T)^{T-1}\big|_{s_T = 1} = T \cdot 2^{T-1}$. Because $T \ge 2$, $T \cdot 2^{T-1} \ge 4$, which is positive. Because $\lambda > \frac{1}{2}$, it is clear that $H_2'(1) < 0$. Therefore, $H_1' > H_2'$.

iii. $\bar{s}_T = \frac{\lambda^3 - 2\lambda^2 + 1}{\lambda^2 - \lambda^3 + \lambda} < 1$, and we know that at this point $\bar{s}_T$, $H_1 = H_2$ (see Section 4.2.3). Therefore, there has been at least one point below $s_T = 1$ at which $H_1 < H_2$.

III. If $\lambda = \frac{1}{2}$: $H_1 = H_2$, $H_1' > H_2'$, and there has been at least one point below $s_T$ at which $H_1 < H_2$ and therefore, in the interval $[0, 1]$, there have been exactly two intersections: one intersection at $s_T = 1$ and one in the interval $[0, 1)$.

Proof: i. $H_1(1) = 2^T$ and $H_2(1) = \frac{1}{(1/2)^T} = 2^T$.

ii. $H_2'(1) = \frac{1 - 2\lambda}{\lambda^T(1 - \lambda)} = 0$ and $H_1'(1) = T \cdot 2^{T-1} > 0$. Therefore, $H_1' > H_2'$.

iii. Because $H_1(0) > H_2(0)$, $H_1'(0) < H_2'(0)$, $H_2(1) = H_1(1)$, $H_2'(1) = 0$, and $H_1'(1) > 0$, there must be one point below $s_T = 1$ at which $H_1$ and $H_2$ intersect.
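The intersection counts in cases I and II can be checked numerically by scanning $H_1 - H_2$ over $[0, 1]$ and counting sign changes. This is only an illustrative sketch: the function names and the grid size are assumptions, and a grid scan can miss two roots that fall inside the same grid cell.

```python
def H1(s, T):
    """Left-hand side: (1 + s)^T."""
    return (1.0 + s) ** T


def H2(s, lam, T):
    """Right-hand side: (s - lam s^2) / (lam^T (1 - lam))."""
    return (s - lam * s * s) / (lam ** T * (1.0 - lam))


def count_crossings(lam, T, n=9973):
    """Count sign changes of H1 - H2 on a uniform grid of n intervals on [0, 1]."""
    diff = [H1(k / n, T) - H2(k / n, lam, T) for k in range(n + 1)]
    return sum(1 for a, b in zip(diff, diff[1:]) if a * b < 0)


if __name__ == "__main__":
    for lam, T in [(0.4, 2), (0.3, 5), (0.8, 2), (0.9, 3)]:
        print(lam, T, count_crossings(lam, T))
```

With these sample values, the scan reports one crossing when $\lambda < \frac{1}{2}$ and two when $\lambda > \frac{1}{2}$. The boundary case $\lambda = \frac{1}{2}$ is deliberately left out, because one of its intersections sits exactly at the endpoint $s_T = 1$ where a sign-change count is unreliable.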

4.2.3 Step 3: A Closer Examination of $\lambda > \frac{1}{2}$ (Finding the $T$-Independent Root)

To better understand the case when $\lambda > \frac{1}{2}$, we begin by looking at the case that is easy to analyze ($T = 2$). Plugging in $T = 2$, we are left with a simple quadratic equation:

$$(\lambda^2 - \lambda^3 + \lambda)s_T^2 + (2\lambda^2 - 2\lambda^3 - 1)s_T + (\lambda^2 - \lambda^3) = 0.$$

By using the quadratic formula, we find the following two roots:

$$r_1 = \frac{\lambda^3 - 2\lambda^2 + 1}{\lambda^2 - \lambda^3 + \lambda} = \frac{1 - \lambda}{\lambda}, \qquad r_2 = \frac{\lambda^2}{1 + \lambda - \lambda^2}.$$

By plugging each of these two roots back into the equation with general $T$, we find that $r_1$, which we define as $\bar{s}_T$, is in fact a root for all $T$.

Proof: By plugging this $\bar{s}_T$ into $(\lambda - 1)\lambda^T(1 + s_T)^T - \lambda s_T^2 + s_T$, we find:

$$\lambda^T(\lambda - 1)\left[\frac{\lambda^2 - \lambda^3 + \lambda + \lambda^3 - 2\lambda^2 + 1}{\lambda(\lambda - \lambda^2 + 1)}\right]^T - \lambda\left[\frac{\lambda^6 - 4\lambda^5 + 4\lambda^4 + 2\lambda^3 - 4\lambda^2 + 1}{\lambda(\lambda^5 - 2\lambda^4 - \lambda^3 + 2\lambda^2 + \lambda)}\right] + \left[\frac{\lambda^3 - 2\lambda^2 + 1}{\lambda(\lambda - \lambda^2 + 1)}\right].$$

The numerator of the first bracket is $\lambda - \lambda^2 + 1$, so the first bracket equals $\frac{1}{\lambda}$ and the first term reduces to $\lambda - 1$. Since $\lambda^3 - 2\lambda^2 + 1 = (1 - \lambda)(\lambda - \lambda^2 + 1)$, the second and third terms equal $-\frac{(1 - \lambda)^2}{\lambda}$ and $\frac{1 - \lambda}{\lambda}$, respectively. Therefore the expression equals

$$(\lambda - 1) - \frac{(1 - \lambda)^2}{\lambda} + \frac{1 - \lambda}{\lambda} = (\lambda - 1) + \frac{(1 - \lambda)\big(1 - (1 - \lambda)\big)}{\lambda} = (\lambda - 1) + (1 - \lambda) = 0.$$

However, the root $\bar{s}_T$ is not a valid root since $\lambda(1 + \bar{s}_T) = 1$, which violates the condition found earlier that $\lambda(1 + s_T) \ne 1$. Therefore, the root we are interested in should satisfy the algebraic identity:

$$\frac{(\lambda - 1)\lambda^T(1 + s_T)^T - \lambda s_T^2 + s_T}{s_T - \bar{s}_T} = 0.$$

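But, in what interval will the unique solution lie? The numerics show that for some values of $T$, $\bar{s}_T$ is the larger of the two roots, while for other values of $T$, $\bar{s}_T$ is the smaller of the two roots. So, is there a simple expression to determine for which $T$ the stable root $s_T^*$ will be in the interval $[0, \bar{s}_T)$ and for which in the interval $(\bar{s}_T, 1]$? The answer lies in the derivative.

If we define $G(T, s_T) = (\lambda - 1)\lambda^T(1 + s_T)^T - \lambda s_T^2 + s_T$, then its derivative with respect to $s_T$ is

$$G'(T, s_T) = -\lambda^T(1 - \lambda)T(1 + s_T)^{T-1} - 2\lambda s_T + 1.$$

At the point $\bar{s}_T$ we have $\lambda(1 + \bar{s}_T) = 1$, so $G'(T, \bar{s}_T) = (2\lambda - 1) - \lambda(1 - \lambda)T$. Hence $G'(T, \bar{s}_T) < 0$ when $T > \frac{2\lambda - 1}{\lambda(1 - \lambda)}$ and $G'(T, \bar{s}_T) > 0$ when $T < \frac{2\lambda - 1}{\lambda(1 - \lambda)}$. When $\lambda > \frac{1}{2}$, $G$ is negative at both $s_T = 0$ and $s_T = 1$, so $G$ is increasing at the smaller of its two roots and decreasing at the larger; the sign of $G'(T, \bar{s}_T)$ therefore tells us which of the two roots $\bar{s}_T$ is. Thus, we now have our answer. In order to solve for the unique point $s_T^*$ which our system converges to when $\lambda > \frac{1}{2}$, we must find the solution to:

$$\frac{(\lambda - 1)\lambda^T(1 + s_T)^T - \lambda s_T^2 + s_T}{s_T - \frac{\lambda^3 - 2\lambda^2 + 1}{\lambda^2 - \lambda^3 + \lambda}} = 0,$$

in the interval

i) $[0, \bar{s}_T)$ if $T > \frac{2\lambda - 1}{\lambda(1 - \lambda)}$,

ii) $(\bar{s}_T, 1]$ if $T < \frac{2\lambda - 1}{\lambda(1 - \lambda)}$.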


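Therefore, we now have an algorithm for finding the unique point ($s_T^*$) our system converges to for all $\lambda$:

1) If $\lambda < \frac{1}{2}$, simply solve for $s_T^*$ such that $(\lambda - 1)\lambda^T(1 + s_T)^T - \lambda s_T^2 + s_T = 0$ in the interval $[0, 1]$.

2) If $\lambda = \frac{1}{2}$, solve for $s_T^*$ such that

$$\frac{(\lambda - 1)\lambda^T(1 + s_T)^T - \lambda s_T^2 + s_T}{s_T - \bar{s}_T} = 0$$

in the interval $[0, 1)$.

3) If $\lambda > \frac{1}{2}$, solve for $s_T^*$ such that

$$\frac{(\lambda - 1)\lambda^T(1 + s_T)^T - \lambda s_T^2 + s_T}{s_T - \bar{s}_T} = 0$$

i) in the interval $[0, \bar{s}_T)$ if $T > \frac{2\lambda - 1}{\lambda(1 - \lambda)}$,

ii) in the interval $(\bar{s}_T, 1]$ if $T < \frac{2\lambda - 1}{\lambda(1 - \lambda)}$.

This proves Lemma 1.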
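The algorithm of Lemma 1 translates into a short root-finding routine. The sketch below is one possible rendering, not the authors' implementation: it uses plain bisection, and instead of dividing by $s_T - \bar{s}_T$ it excludes the $T$-independent root $\bar{s}_T = \frac{1 - \lambda}{\lambda}$ by stopping the search interval a small distance `eps` away from it. The names `G`, `solve_s_T`, and `eps` are illustrative, and the boundary case $T = \frac{2\lambda - 1}{\lambda(1 - \lambda)}$, which the text does not treat, is rejected.

```python
def G(s, lam, T):
    """G(T, s) = (lam - 1) lam^T (1 + s)^T - lam s^2 + s."""
    return (lam - 1.0) * lam ** T * (1.0 + s) ** T - lam * s * s + s


def _bisect(f, lo, hi, iters=200):
    """Bisection on [lo, hi]; requires f(lo) and f(hi) to have opposite signs."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change on the search interval")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_mid == 0.0:
            return mid
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def solve_s_T(lam, T, eps=1e-9):
    """Return s_T^* following the case split of Lemma 1 (0 < lam < 1, T >= 1)."""
    g = lambda s: G(s, lam, T)
    if lam < 0.5:
        return _bisect(g, 0.0, 1.0)          # case 1: single root in [0, 1]
    s_bar = (1.0 - lam) / lam                # T-independent (invalid) root
    crit = (2.0 * lam - 1.0) / (lam * (1.0 - lam))
    if T > crit:
        return _bisect(g, 0.0, s_bar - eps)  # cases 2 and 3(i)
    if T < crit:
        return _bisect(g, s_bar + eps, 1.0)  # case 3(ii)
    raise ValueError("T equals the critical value; not covered by Lemma 1")


if __name__ == "__main__":
    for lam, T in [(0.4, 2), (0.5, 2), (0.8, 1), (0.8, 2), (0.8, 5)]:
        print(lam, T, solve_s_T(lam, T))
```

For $T = 2$ the result can be compared with the explicit root $r_2 = \frac{\lambda^2}{1 + \lambda - \lambda^2}$ from Step 3, and for $T = 1$ with $s_1 = \lambda$.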


4.2.4 Step 4: Finding $s_i$ for $i > T + 1$

Now that we know $s_T$ and $s_i$ for $i \le T + 1$, we can find $s_i$ for $i > T + 1$. Recall that we will be solving the following recursive relation with initial conditions $s_T$ and $s_{T+1}$:

$$s_{i+1} = s_i - \lambda(s_{i-1}^2 - s_i^2) \quad \text{for } i \ge T + 1.$$

We will begin by proving a simple and convenient relation between $s_T$ and $s_{T+1}$ through taking advantage of a telescoping sum.

From $s_{i+1} - s_i = \lambda(s_i - s_{i-1})(1 + s_T)$ for $1 \le i < T + 1$, we know that:

$$\sum_{i=1}^{T}(s_{i+1} - s_i) = (s_2 - s_1) + (s_3 - s_2) + (s_4 - s_3) + \dots + (s_{T+1} - s_T) = s_{T+1} - s_1 = s_{T+1} - \lambda,$$

and

$$\sum_{i=1}^{T}\lambda(s_i - s_{i-1})(1 + s_T) = \lambda(1 + s_T)\big[(s_1 - s_0) + (s_2 - s_1) + (s_3 - s_2) + \dots + (s_T - s_{T-1})\big] = \lambda(1 + s_T)(s_T - s_0) = \lambda(1 + s_T)(s_T - 1).$$

Therefore, $s_{T+1} - \lambda = \lambda(s_T + 1)(s_T - 1) = \lambda(s_T^2 - 1)$, and

$$s_{T+1} = \lambda s_T^2.$$


We now return to the current recursion. In a similar fashion, we take advantage of a telescoping sum (from $T + 1$ to an integer $N$ that is sufficiently large) to simplify the expression in question.

From $s_{i+1} - s_i = \lambda(s_i^2 - s_{i-1}^2)$ for $i \ge T + 1$, we know that:

$$\sum_{i=T+1}^{N}(s_{i+1} - s_i) = (s_{T+2} - s_{T+1}) + (s_{T+3} - s_{T+2}) + \dots + (s_{N+1} - s_N) = s_{N+1} - s_{T+1},$$

and

$$\sum_{i=T+1}^{N}\lambda(s_i^2 - s_{i-1}^2) = \lambda\big[(s_{T+1}^2 - s_T^2) + (s_{T+2}^2 - s_{T+1}^2) + \dots + (s_N^2 - s_{N-1}^2)\big] = \lambda(s_N^2 - s_T^2).$$

Therefore, $s_{N+1} = (s_{T+1} - \lambda s_T^2) + \lambda s_N^2$. And by plugging in $s_{T+1} = \lambda s_T^2$, we find that

$$s_{N+1} = \lambda s_N^2.$$

To simplify this recursion, we will transform it by taking the logarithm base $\lambda$ and substituting in $a_N = \log_\lambda s_N$. This gives $a_{N+1} = 1 + 2a_N$; subtracting two consecutive instances of this relation eliminates the constant. Thus, we come to the following linear homogeneous recurrence relation with constant coefficients, which we can solve using the same method as Step 1:

$$a_{n+2} = 3a_{n+1} - 2a_n.$$

Recall that we first must find the roots of the following equation: $x^2 - 3x + 2 = 0$. The resulting roots $r_1 = 2$ and $r_2 = 1$ lead us to the following relationship: $a_i = \alpha_1 2^i + \alpha_2$. And substituting back in, we find that $\log_\lambda s_i = \alpha_1 2^i + \alpha_2$.

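To solve for $\alpha_1$ and $\alpha_2$, we use the initial conditions and the fact that $s_{T+1} = \lambda s_T^2$, which give us the following three equations:

$$\begin{cases} \log_\lambda s_T = \alpha_1 2^T + \alpha_2, \\ \log_\lambda s_{T+1} = \alpha_1 2^{T+1} + \alpha_2, \\ \log_\lambda s_{T+1} = 1 + 2\log_\lambda s_T. \end{cases}$$

Through substitution, we see that $\alpha_1 = \frac{1 + \log_\lambda s_T}{2^T}$ and $\alpha_2 = -1$, and therefore $\log_\lambda s_i = (1 + \log_\lambda s_T)2^{i-T} - 1$. By taking the exponential of both sides, we come to our answer:

$$s_i = \lambda^{(1 + \log_\lambda s_T)2^{i-T} - 1} = \lambda^{2^{i-T} - 1}\, s_T^{\,2^{i-T}}.$$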
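The doubly exponential tail from Step 4 can be compared against direct iteration of $s_{N+1} = \lambda s_N^2$. In the sketch below the function names are illustrative, and the value of $s_T$ is taken to be the $T = 2$ root $\frac{\lambda^2}{1 + \lambda - \lambda^2}$ from Step 3 purely as a sample input.

```python
def tail_closed_form(i, lam, T, s_T):
    """s_i = lam^(2^(i-T) - 1) * s_T^(2^(i-T)) for i >= T."""
    k = 2 ** (i - T)
    return lam ** (k - 1) * s_T ** k


def tail_by_recursion(n_terms, lam, s_T):
    """Return [s_T, s_{T+1}, ..., s_{T+n_terms}] using s_{N+1} = lam * s_N^2."""
    s = [s_T]
    for _ in range(n_terms):
        s.append(lam * s[-1] ** 2)
    return s


if __name__ == "__main__":
    lam, T = 0.8, 2
    s_T = lam ** 2 / (1.0 + lam - lam ** 2)
    for j, value in enumerate(tail_by_recursion(5, lam, s_T)):
        print(T + j, value, tail_closed_form(T + j, lam, T, s_T))
```

Because the exponent doubles at every level, only a handful of terms are numerically distinguishable from zero, which is the superexponential decay referred to in the abstract.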


4.2.5  Conclusion 


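We have now proven equation (5). Given $s_T^*$, the fixed point of our system is:

$$s_i = \begin{cases} \dfrac{1 - \lambda}{1 - \lambda(1 + s_T^*)}\,\lambda^i(1 + s_T^*)^i + \dfrac{\lambda - \lambda(1 + s_T^*)}{1 - \lambda(1 + s_T^*)}, & i \le T + 1, \\[2ex] \lambda^{2^{i-T} - 1}\,(s_T^*)^{2^{i-T}}, & i > T + 1. \end{cases}$$

Using arguments similar to those used in [1], it can be shown that the limiting stationary distribution is equal to this computed fixed point. □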


5  The  Threshold  Supermarket  Game 

5.1  Description  of  the  Model 

The classic load balancing problem traditionally focuses on models with a central dispatcher that distributes load across the system. Often, however, customers themselves must decide which queue to join, and most often, they prefer queues of shorter length. However, because sampling queues is costly, customers must find the right balance between probing enough queues to find one that is sufficiently short but not so many that the cost from sampling outweighs the benefit. Because an arriving customer’s expected queue length depends on the choices of all preceding customers, this problem is a game among customers.

The classic supermarket game, with Poisson arrivals to $n$ parallel FIFO queues and i.i.d. exponential service times with mean $\mu = 1$, has previously been analyzed by Xu and Hajek [16] in their paper titled “The Supermarket Game.” A summary of their paper can be found in Section 5.3. We aim to expand upon this and look at a variation of the supermarket game in which $d$, the number of queues sampled by each customer, is dependent on the length of the initial queue observed and some threshold $T$, following the same dynamics defined in equation (2). Upon entering the system, each customer arrives to her dedicated queue $i$. If the customer’s dedicated queue has length $q$ which is greater than or equal to some threshold $T$, an alternate queue is sampled, and the customer joins the shorter queue. If the queue length $q$ is less than the threshold $T$, the customer simply joins this queue, bypassing the extra work and cost of an additional search. Note that the dynamics of this system are equivalent to those of the threshold supermarket model described above (in which the initial queue is selected uniformly at random upon arrival); however, this set-up proves more relevant to this associated game.

Customers must choose a strategy to determine what threshold $T$ they will use when arriving to the system. A customer can devise a pure strategy or a mixed strategy. A pure strategy for customer $i$ corresponds to a sampling strategy $d(q, T_i)$, for some deterministic threshold choice $T_i \in \{0, 1, 2, 3, \dots\}$. Alternatively, a customer can devise a mixed strategy, which assigns a certain probability to each possible pure strategy. In other words, a customer can randomize over the set of pure strategies in order to choose a threshold value. In this case, customers are assumed to use the strategy $\mu_i$; namely, they randomize over a set $\mathcal{T}$ of a finite number of thresholds, where the probability that player $i$ uses the sampling strategy $d(q, T_i)$ with threshold $T_i$ is $\mu_i(T_i)$. We will only consider pure strategies.

Note that when all customers use the same pure strategy, namely the sampling strategy $d(q, T_i)$ for some deterministic $T_i \in \mathbb{N}$, the game will have a limiting stationary expected queue length distribution that corresponds to that of our basic threshold model described in Section 4. Additionally, note that if all customers use the optimal pure strategy threshold $T^*$ found in Section 3.3, the resulting cost coincides with a socially optimal equilibrium. That is, when all customers follow strategy $T^*$, the resulting equilibrium is the solution to what is called the “social planner problem” in economics. As the name implies, the “social planner problem” seeks to find the equilibrium that is collectively the best for all players. In this previous analysis of the socially optimal equilibrium, we assumed the existence of a centralized social planner who imposed a universal optimal threshold on the system. We now shift our focus to the more complicated decentralized setting in which each customer follows her own strategy for choosing a threshold.

We consider the situation where a customer arriving to queue $i$ uses the pure strategy associated with the threshold $T_i$, while all other customers use the pure strategy corresponding to a (possibly) different threshold $T_{-i}$. Let $\{s_j(T_{-i})\}$ be the limiting stationary expected queue length distribution corresponding to the threshold model with threshold $T_{-i}$. Then the distribution of the remaining queues, as observed by any customer $i$, will be $\{s_j(T_{-i})\}$, and this will determine the distribution of the queues sampled by customers joining queue $i$. Also, let $s_j(T_i, T_{-i})$ be the stationary distribution of queue $i$ when customers arriving at queue $i$ use the threshold $T_i$, and customers arriving at all other queues use threshold $T_{-i}$. Then the cost to customers arriving to queue $i$ is given by

$$\begin{aligned} C_i(T_i, T_{-i}) &= E[\text{queue length}] + c_s E[\text{number of extra searches}], \\ &= \sum_{j=1}^{\infty} s_j(T_i, T_{-i}) + c_s\, s_{T_i}(T_{-i}). \end{aligned} \tag{7}$$
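In the symmetric case $T_i = T_{-i} = T$, the cost (7) reduces to $\sum_{j \ge 1} s_j + c_s s_T$ evaluated at the fixed point of Section 4.2, which makes it easy to reproduce individual entries of the cost tables. The sketch below assumes $s_T$ is already known (here $s_1 = \lambda$ for $T = 1$ and the explicit $T = 2$ root from Section 4.2.3); the function names and the tail length `n_tail` are illustrative choices.

```python
def fixed_point(lam, T, s_T, n_tail=12):
    """Build [s_0, s_1, ..., s_{T + n_tail}] for a given threshold T and s_T."""
    s = [1.0, lam]
    for i in range(1, T):
        # linear part of the recursion, valid while T > i - 1
        s.append(s[i] - lam * (s[i - 1] - s[i]) * (1.0 + s_T))
    for _ in range(n_tail):
        # tail relation s_{N+1} = lam * s_N^2 for N >= T
        s.append(lam * s[-1] ** 2)
    return s


def symmetric_cost(lam, T, s_T, c_s):
    """Expected queue length plus c_s times the probability of a second sample."""
    s = fixed_point(lam, T, s_T)
    return sum(s[1:]) + c_s * s[T]


if __name__ == "__main__":
    lam = 0.8
    print(symmetric_cost(lam, 1, lam, 0.1))
    s2 = lam ** 2 / (1.0 + lam - lam ** 2)
    print(symmetric_cost(lam, 2, s2, 0.4))
    print(symmetric_cost(lam, 2, s2, 0.5))
```

For $\lambda = 0.80$ these three evaluations agree to about four decimal places with the equilibrium costs 1.6379, 1.8652, and 1.9204 reported for $c_s = 0.1$, $0.4$, and $0.5$.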
For simplicity, assume we are given lower and upper thresholds $0 \le T_{\min} < T_{\max} < \infty$, such that the thresholds $T_i$ and $T_{-i}$ must lie in $[T_{\min}, T_{\max}]$.

Let $T^{NE}$ denote the pure strategy Nash equilibrium of our threshold supermarket game. A threshold $T^{NE}$ is a pure strategy Nash equilibrium if

$$C(T^{NE}, T^{NE}) \le C(T_i, T^{NE}), \quad \text{for all } T_i \in \{T_{\min}, \dots, T_{\max}\}.$$
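The Nash condition is a finite check once a cost function is available, so it can be tested by brute force over $\{T_{\min}, \dots, T_{\max}\}$. The sketch below is generic: `pure_nash_thresholds` accepts any callable `cost(t_i, t_other)`, and `toy_cost` is a hypothetical stand-in used only to exercise the search, not the cost (7) of the threshold supermarket game.

```python
from typing import Callable, List


def pure_nash_thresholds(
    cost: Callable[[int, int], float],
    t_min: int,
    t_max: int,
    tol: float = 1e-12,
) -> List[int]:
    """Return every T with cost(T, T) <= cost(T_i, T) for all T_i in the range."""
    candidates = range(t_min, t_max + 1)
    equilibria = []
    for t in candidates:
        own = cost(t, t)
        # t is an equilibrium if no unilateral deviation lowers the cost
        if all(own <= cost(t_i, t) + tol for t_i in candidates):
            equilibria.append(t)
    return equilibria


def toy_cost(t_i: int, t_other: int) -> float:
    """Hypothetical cost: prefers threshold 2, with a small penalty for deviating."""
    return (t_i - 2) ** 2 + 0.1 * abs(t_i - t_other)


if __name__ == "__main__":
    print(pure_nash_thresholds(toy_cost, 0, 5))
```

Replacing `toy_cost` with an implementation of (7) would turn the same search into a test of whether a given threshold is a pure strategy Nash equilibrium of the game.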


5.2  Main  Results 

Recall that the socially optimal strategy results in an equilibrium that makes all players collectively better off, and the Nash equilibrium is that which is reached when each player acts selfishly, choosing a strategy that makes only herself better off, regardless of the effect on others. Policymakers generally want a system to reach the socially optimal equilibrium. Accordingly, policymakers often put incentives, such as taxes or subsidies, into place to tempt players to behave in such a way that will allow the system to reach the socially optimal equilibrium.

With this in mind, we ask: Given a system’s cost of sampling $c_s$, is it possible to reach the social optimum in this decentralized setting by providing incentives to customers? We aim to find some $\hat{c}_s$, where $|\hat{c}_s - c_s|$ will serve as either a subsidy or tax, such that when imposed, there exists a Nash equilibrium strategy, which we will call $T^{NE}(\hat{c}_s)$, that leads to an equilibrium average total cost $C(T^{NE}, T^{NE})$ equal to that of the social optimum. In other words, we seek some $\hat{c}_s$ which leads customers to self-organize in such a way that allows the system to reach its social optimum.

To complete this analysis, we must first understand the distribution $s_j(T_i, T_{-i})$. It is easy to see that $s_j(T_i, T_{-i})$ is the stationary distribution associated with a birth-death process and can be represented by the following recursion:

$$s_0 = 1, \qquad s_1 = \lambda,$$

$$s_j(T_i, T_{-i}) - s_{j+1}(T_i, T_{-i}) = \begin{cases} \lambda\big(s_{j-1}(T_{-i}) - s_j(T_{-i})\big)\big(1 + s_{T_i}(T_{-i})\big), & \text{for } 1 \le j \le T_i, \\ \lambda\big(s_{j-1}^2(T_{-i}) - s_j^2(T_{-i})\big), & \text{for } j > T_i. \end{cases}$$

Note that when $T_i = T_{-i} = T$, this reduces to the algebraic equations (3) that characterize the fixed point for the supermarket model with threshold $T$. (But they are different when $T_i \ne T_{-i}$.)

To find $s_j(T_i, T_{-i})$, one must simply take the sum of the above quantity from $j$ to infinity. By taking advantage of the resulting telescoping sum, the distribution for $j \ge 1$ simplifies to:

$$s_j(T_i, T_{-i}) = \begin{cases} \lambda(1 + s_{T_i})(s_{j-1} - s_{T_i}) + \lambda s_{T_i}^2, & \text{for } 1 \le j \le T_i, \\ \lambda s_{j-1}^2, & \text{for } j > T_i, \end{cases} \tag{8}$$

where every $s$ on the right-hand side is evaluated at threshold $T_{-i}$.
Using this equation, for a given cost $c_s$, we can now find $\hat{c}_s$ which, when imposed, leads to a Nash equilibrium total average cost which coincides with the socially optimal total cost (or $C(T^{NE}, T^{NE}) = C(T^*, T^*)$). The following table displays what we find.
Table 5: Social Optimum Costs Coinciding with Nash Equilibrium Costs

$\lambda$   $c_s$   $T^*$   $\hat{c}_s$   $T^{NE}$   Equil. Cost
0.40        0.1     1       0.1           1          0.5056
            0.2     1       0.2           1          0.5456
            0.3     2       0.3           2          0.5744
            0.4     2       0.4           2          0.5873
            0.5     2       0.5           2          0.6002
            0.7     3       0.7           3          0.6259
            0.8     3       0.8           3          0.6303
            0.9     3       0.9           3          0.6348
            1.1     4       1.2           3          0.6477
            1.3     4       1.3           4          0.6510
            1.4     5       1.7           4          0.6571
            1.5     5       1.7           4          0.6577
            1.6     5       1.7           4          0.6584
            1.7     5       1.8           4          0.6590
0.50        0.1     1       0.1           1          0.6828
            0.2     1       0.2           1          0.7328
            0.3     2       0.3           2          0.7802
            0.4     2       0.4           2          0.8002
            0.5     2       0.5           2          0.8202
            0.6     2       0.6           2          0.8402
            0.9     3       0.9           3          0.8900
            1.0     3       1.0           3          0.8983
            1.5     4       1.5           3          0.9399
            1.6     5       2.0           4          0.9575
            1.7     5       2.0           4          0.9592
0.70        0.1     1       0.1           1          1.2001
            0.2     1       0.2           1          1.2701
            0.4     2       0.4           2          1.391
            0.5     2       0.5           2          1.4315
            0.6     2       0.6           2          1.472
            1.0     3       1.0           3          1.6252
            1.1     3       1.1           3          1.6475
            1.4     4       1.6           3          1.7583
0.80        0.1     1       0.1           1          1.6379
            0.4     2       0.4           2          1.8652
            0.5     2       0.5           2          1.9204
            0.7     3       0.8           2          2.0858
            1.0     3       1             3          2.1909
0.90        N/A     N/A     N/A           N/A        N/A
0.95        N/A     N/A     N/A           N/A        N/A

Thus, for certain given “actual” marginal costs of sampling $c_s$, we have identified
corresponding “imposed” marginal costs of sampling $\hat{c}_s$ such that the social optimum
for $c_s$ coincides with the Nash equilibrium for $\hat{c}_s$. The difference between
$\hat{c}_s$ and $c_s$ can be viewed as a tax (or subsidy) that a system operator levies on
individuals so that when each customer behaves selfishly to minimize her cost, the system
converges to the social optimum corresponding to the marginal sampling cost $c_s$.

These results also indicate that in particular cases, the socially optimal strategy is
already a Nash equilibrium without any manipulation of sampling costs; in other words, the
imposed cost $\hat{c}_s$ equals the actual cost $c_s$. In these cases no intervention is
necessary for the system to reach the optimal steady-state distribution. Note, however,
that for high arrival rates ($\lambda \geq 0.90$), the social optimum never coincides with
a Nash equilibrium. Additionally, note that we only considered sampling costs in the
interval [0, 2] in increments of 0.10.

Given a cost of sampling and assuming all other customers are following the socially
optimal strategy, Figures 9 and 10 show how the Nash equilibrium strategy for player $i$
compares to the socially optimal strategy. In other words, given $c_s$ and assuming
$T_{-i} = T^*$, the following figures show $T_i^{NE} - T^*$. If $T_i^{NE} \neq T^*$, we
refer to this as a deviation.

[Figure 9: Comparison of $T_i^{NE}$ to $T^*$ for $\lambda = 0.80$. Horizontal axis: Cost of Sampling, from 0.1 to 2.]

[Figure 10: General Comparison of $T_i^{NE}$ to $T^*$. Panels: (a) $\lambda = 0.40$, (b) $\lambda = 0.50$, (c) $\lambda = 0.70$, (d) $\lambda = 0.80$, (e) $\lambda = 0.90$, (f) $\lambda = 0.95$.]

Further  work  can  be  done  to  solve  for  all  Nash  equilibria  of  this  game  for  all  sampling  costs. 
Additionally,  further  work  can  be  done  to  explore  uniqueness  of  these  Nash  equilibria. 

5.3 Prior Work

5.3.1  Basic  Game  -  The  Supermarket  Game 

In a paper titled “The Supermarket Game,” Xu and Hajek [16] analyze a classic supermarket
game. The model is a “game” because the waiting time of a customer depends on other
customers’ choices and because customers attempt to find the shortest queues to join. The
basic model remains the same: customers arrive according to a Poisson process, $N$
processors serve customers under a FIFO discipline with exponentially distributed service
times, and ties are broken uniformly at random. Xu and Hajek add costs to the model: each
customer faces a cost for waiting and a cost for sampling. Customers are assumed to be
selfish and aim to minimize their own total cost. The authors also analyze the social
optimum, in which the total cost of all customers is minimized. They seek to determine the
optimal number of queues for a selfish customer to sample and whether some customers
sampling more queues is advantageous or harmful to other customers. The answer to the
latter question, regarding the externality of sampling more queues, is not immediately
obvious. Sampling more queues leads to a better-balanced system and therefore reduces the
average waiting time; however, it also prevents future customers from finding less-loaded
queues.
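
The trade-off a selfish customer faces can be sketched numerically in the mean-field limit. The code below is an illustration under stated assumptions, not Xu and Hajek's formulation: it assumes every other customer samples exactly `d` queues (a pure strategy), uses the standard mean-field recursion $s_j = \lambda s_{j-1}^d$ for the fraction of queues with at least $j$ customers, and charges a hypothetical linear cost `c_wait` per unit of expected time in the system (waiting plus service) and `c_sample` per queue sampled. All function and parameter names are made up for this example.

```python
def supermarket_tail(lam, d, n_terms=60):
    """Mean-field tail probabilities s_0..s_{n_terms} when everyone samples d queues."""
    s = [1.0]
    for _ in range(n_terms):
        s.append(lam * s[-1] ** d)
    return s


def tagged_customer_cost(lam, d, k, c_wait, c_sample, n_terms=60):
    """Expected cost of one customer who samples k queues while all others sample d.

    The shortest of k independent samples has length >= j with probability s_j**k,
    so the expected queue length found is sum_{j>=1} s_j**k. With unit-mean
    exponential service and FIFO, the expected time in system is that sum plus one.
    """
    s = supermarket_tail(lam, d, n_terms)
    expected_queue_found = sum(x ** k for x in s[1:])
    expected_time_in_system = expected_queue_found + 1.0
    return c_wait * expected_time_in_system + c_sample * k


def best_response(lam, d, c_wait, c_sample, k_max=10):
    """Number of queues in 1..k_max that minimizes the tagged customer's cost."""
    return min(range(1, k_max + 1),
               key=lambda k: tagged_customer_cost(lam, d, k, c_wait, c_sample))


if __name__ == "__main__":
    for c in (0.01, 0.1, 1.0):
        print(c, best_response(lam=0.9, d=2, c_wait=1.0, c_sample=c))
```

The truncation at `n_terms` is harmless for moderate loads but should be increased when $\lambda$ is close to 1 and `d = 1`, where the tail decays slowly.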

The  pair  first  examines  the  case  with  homogeneous  waiting  cost.  They  arrive  at  the 
following  conclusions: 

1. There always exists a mixed strategy Nash equilibrium for $\lambda < 1$.

2. The Nash equilibrium is unique for $\lambda \leq 1/\sqrt{2}$.

3. The Nash equilibrium is unique if and only if a local monotonicity
condition is satisfied.

4. The sampling of more queues by some customers has a positive
externality on other customers in the mean field model. This implies that
no more queues are sampled for any Nash equilibrium than for the social
optimum. However, the sampling of more queues can have a negative
externality in the case of finite $N$ servers.

5. Multiple Nash equilibria exist for a particular example with $\lambda = 0.999$.

6. The Nash equilibrium for $\lambda = 0.999$ is unique if customers are limited
to being able to sample only one or two queues.

They then study the case with heterogeneous waiting costs. Specifically, they draw each
customer’s waiting cost from a non-degenerate continuous probability distribution, and
they prove the existence of a pure strategy Nash equilibrium.

In conclusion, a Nash equilibrium always exists when $\lambda < 1$ and is unique for
$\lambda \leq 1/\sqrt{2}$ with homogeneous waiting costs. The supermarket game is an
intriguing idea, and there is currently very little literature on it, leaving much room
for expansion. How do these results change with general service times or heterogeneous
service times? Is it possible to provide an incentive that will move the Nash equilibria
closer to the social optimum without losing throughput or hurting efficiency? What happens
if asymmetry is introduced to the customer strategy? Many interesting topics remain to be
addressed in this field.

5.3.2  Extensions  of  the  Basic  Game:  Queue-Length-Based  Scheduling 

Manjrekar, Ramaswamy, and Shakkottai [17] examine a scheduling game in which smart phone
applications compete for service from base stations. They seek a scheduling algorithm,
based on auctions in which the players compete for service, that replicates the
advantageous results of the Longest-Queue-First (LQF) policy. As the name suggests, under
the LQF policy each server serves the longest of the queues requesting service. The major
benefit of such a policy is minimizing the expected time of the longest queue in the
system.

To replicate the LQF policy’s results, Manjrekar et al. analyze a second-price auction.
In such a regime, the highest bidder wins and is charged the amount of the second-highest
bid. Such an auction is known to promote truthful bidding (that is, it discourages
misreporting aimed at “beating the system”). In the proposed model, smart phone
applications send service requests, experience a waiting cost, and bid for service from
their base station. In order to generate a bid, each player models opponents through a
distribution over the possible action spaces. The player then chooses a best response. A
mean-field equilibrium is reached if this best response can be found in the distribution.
Manjrekar et al. seek to find an equilibrium in the auction that will replicate the
beneficial results of the LQF policy.
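
The second-price rule described above can be sketched in a few lines. This is a generic sealed-bid second-price auction, not the specific mechanism of Manjrekar et al.; the function name and the bidder labels are hypothetical, and ties are resolved in favor of the bidder listed first.

```python
def second_price_auction(bids):
    """Return (winner, price) for a sealed-bid second-price auction.

    bids maps a bidder label to a numeric bid. The highest bidder wins and
    pays the second-highest bid. Ties go to the bidder listed first, because
    Python's sort is stable.
    """
    if len(bids) < 2:
        raise ValueError("a second-price auction needs at least two bids")
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price


if __name__ == "__main__":
    # Hypothetical bids from three applications at one base station.
    print(second_price_auction({"app_a": 3.0, "app_b": 5.0, "app_c": 4.0}))
```

Because the winner's payment does not depend on her own bid, overbidding or underbidding cannot lower the price she pays, which is the intuition behind the truthfulness property mentioned above.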

Through  a  mean-field  analysis  of  this  scheduling  protocol  in  cellular  networks  and 
verification  through  simple  simulations,  Manjrekar  et  al.  discover  that  as  the  number  of 
servers  approaches  infinity,  performing  a  second-price  auction  at  each  server  does  in  fact 
perform  as  well  as  a  queue-length-based  scheduling  algorithm.  A  natural  extension  of  this 
work  would  be  to  add  classifications  of  different  types  of  applications.  In  practice,  requests 
from  different  applications  have  different  cost  functions  and  arrival  rates.  This  simple 
modification  of  the  model  could  add  great  insight  into  this  area  of  study. 

6  Conclusions  and  Further  Work 

We have shown that when a cost is assigned to polling in a load balancing problem,
implementing a threshold to dynamically determine the number of queues to sample
significantly reduces the expected total cost of customer $i$. For all $\lambda$ and every
sampling cost greater than zero, the threshold model always performs better than the
corresponding case without a threshold. The optimal thresholds found, for a given
$\lambda$ and cost structure, coincide with the system’s social optimum. For certain given
“actual” marginal costs of sampling $c_s$, there exist corresponding “imposed” marginal
costs of sampling $\hat{c}_s$ such that the social optimum for $c_s$ coincides with the
Nash equilibrium for $\hat{c}_s$. This implies that a tax (or subsidy), calculated from
the difference between $\hat{c}_s$ and $c_s$, can be levied on individuals to steer
selfish customers to the social optimum. Because sampling does realistically cost the
system, we believe this model will prove useful in the design of many load balancing
systems.

Still, several open questions remain. The most obvious questions seek to discover
analytic solutions to the cost optimization and pure strategy Nash equilibrium problems.
The most interesting expand upon the described threshold supermarket game. How do the
results change when mixed strategies are considered? Would incentives truly push the
system into the socially optimal equilibrium in practice? Many implications of our
threshold model and game remain to be explored.